Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

The `edits` endpoint using the `text-davinci-edit-001` model already does this, and does not seem to allow prompt injection through the input text.

API docs: https://beta.openai.com/docs/api-reference/edits/create

Guide: https://beta.openai.com/docs/guides/completion/editing-text

Edit: It does not seem to protect against injection.



> Edit: It does not seem to protect against injection.

My guess is that in the current implementation of the edits endpoint, the two inputs are being in some way intermingled under the hood (perhaps being concatenated in some way, along with OpenAI-designed prompt sections in between). So the Harvard Architecture Language Model approach should still work once implemented with true separation of inputs.

To ensure the two token streams are not accidentally comingled, my recommendation is that the Trusted and Untrusted inputs should use completely incompatible token dictionaries, so that intermingling them isn't possible.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: