An insert cannot be idempotent by the very definition of the operation.

joncrocks · on March 1, 2017

It can be, as long as some kind of identity information is associated with the request and supplied again with any later request that is the 'same'.

e.g. Instead of saying "Insert a transaction for £20", you say, "Insert a transaction that entity A calls 1234, for £20".

Stripe use something like this to ensure that as long as you call their API to ask for a payment and use the same key, they won't charge the person twice.

https://stripe.com/docs/api#idempotent_requests

naasking · on March 1, 2017

Requesting an insert can be idempotent. Each non-idempotent operation is tagged with a unique identifier. If the server has no inserted data tagged under that identifier, it performs the insert; if it does, it returns the usual success code as if it had just performed the insert. This unique identifier is simply a durable representation of a future, which I described above.
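A minimal sketch of that check, assuming an in-memory dict stands in for a table with a unique index on the identifier. The names `TransactionStore`, `request_id`, and the `"entity-A:1234"` key format are made up for illustration and are not any particular service's API:

```python
import itertools


class TransactionStore:
    """In-memory stand-in for a table with a unique index on request_id."""

    def __init__(self):
        self._by_request_id = {}                # request_id -> stored record
        self._next_row_id = itertools.count(1)  # hypothetical server-side row IDs

    def insert(self, request_id, amount):
        """Insert once per request_id; a retry gets the original result back."""
        existing = self._by_request_id.get(request_id)
        if existing is not None:
            # Already performed: report success as if we had just inserted.
            return existing
        record = {"row_id": next(self._next_row_id), "amount": amount}
        self._by_request_id[request_id] = record
        return record


store = TransactionStore()
first = store.insert("entity-A:1234", 20)
retry = store.insert("entity-A:1234", 20)   # e.g. resent after a timeout
other = store.insert("entity-A:1235", 20)   # a genuinely new transaction
```

Here `first` and `retry` are the same record, while `other` gets a new row, so resending a request after a lost response does not create a duplicate.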

int_19h · on March 1, 2017

This would seem to require the identifiers to be attached to records in perpetuity (i.e. it would basically require client-generated IDs everywhere), so that the server can reliably verify that this operation has already produced a record. I can see it working, but it's a far-reaching change that may not be easy to adopt for existing data storage schemas.

naasking · on March 1, 2017

> This would seem to require the identifiers to be attached to records in perpetuity (i.e. it would basically require client-generated IDs everywhere)

Not sure what you mean by client-generated. The server-side app generates the id because it has to store and interpret it. This can be as simple as a sequence number, similar to how TCP guarantees delivery. It depends on the schema really.

This is the simplest way to ensure you can perform arbitrary retries of POST in case of a network partition.

This data doesn't even need to be integrated with your app's schema, although that's ideal so you can manage the storage lifetime. But you could use an entirely separate store for GUIDs and cached replies, and so it becomes transparent to your app and just becomes another layer.
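As a rough sketch of that separate layer, assuming a dict stands in for the GUID/reply store: `IdempotencyLayer`, `create_payment`, and `ledger` are hypothetical names, and the wrapped handler knows nothing about request IDs.

```python
import uuid


class IdempotencyLayer:
    """Wraps a handler and replays the cached reply for a repeated request ID."""

    def __init__(self, handler):
        self._handler = handler
        self._replies = {}   # request GUID -> reply; a separate store in practice

    def __call__(self, request_id, payload):
        if request_id in self._replies:
            # Retry: return the cached reply without running the handler again.
            return self._replies[request_id]
        reply = self._handler(payload)
        self._replies[request_id] = reply
        return reply


ledger = []  # stands in for the application's own storage


def create_payment(payload):
    """Plain non-idempotent handler: every call appends a row."""
    ledger.append(payload)
    return {"status": 201, "position": len(ledger)}


endpoint = IdempotencyLayer(create_payment)
request_id = str(uuid.uuid4())
reply_1 = endpoint(request_id, {"amount": 20})
reply_2 = endpoint(request_id, {"amount": 20})   # retried after a lost response
```

The sketch ignores atomicity: in a real deployment the handler's write and the cached reply would need to commit together, otherwise a crash between the two could reintroduce duplicates.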

int_19h · on March 1, 2017

> Not sure what you mean by client-generated. The server-side app generates the id because it has to store and interpret it.

Per the description above:

"Each non-idempotent operation is tagged with a unique identifier"

Since the operation originates on the client, the client has to tag it with the identifier, no? And the server has to store this identifier in a way that associates it with any data affected by that operation in a non-idempotent way.

Or are you saying that the client first has to make a round-trip to the server to generate the ID, and then use that server-provided ID for the actual POST?

naasking · on March 2, 2017

> Or are you saying that the client first has to make a round-trip to the server to generate the ID, and then use that server-provided ID for the actual POST?

This is always the case for REST given HATEOAS, i.e. you've already made some hypermedia requests to obtain the URL of the endpoint to which you will POST.

Unless the resource you're posting to actually is the public entry point of your service, but that would be very unusual.

int_19h · on March 2, 2017

Wait, but what about the request used to obtain the URL of the endpoint to which you're posting? Isn't that one then not idempotent (since it would create new URLs every time)?

Generally speaking, what is the flow like? Suppose I allocated myself an endpoint, but then never posted anything to it - what does the endpoint actually contain then, if queried? Do unused ones get "garbage collected" somehow eventually?

naasking · on March 2, 2017

> Wait, but what about the request used to obtain the URL of the endpoint to which you're posting? Isn't that one then not idempotent (since it would create new URLs every time)?

Not necessarily. The numbers don't have to be stored before they're actually used by clients. For instance, you could return a simple integer, like an object version #, and an HMAC(integer, resource URL) to ensure the client can't tamper with it. You only store data under that integer when the user successfully POSTs to that resource for the first time.
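A sketch of the tamper-proof integer using Python's standard `hmac` module. The secret, the `|` separator, and the function names are assumptions for illustration; nothing is stored on the server when the token is issued.

```python
import hashlib
import hmac

SECRET_KEY = b"server-side-secret"   # hypothetical; a real key lives outside the source


def sign(version, resource_url):
    """Return a hex HMAC-SHA256 tag binding the integer to the resource URL."""
    message = f"{version}|{resource_url}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def issue_token(version, resource_url):
    """Hand the client an integer plus its tag; nothing is stored server-side."""
    return {"version": version, "tag": sign(version, resource_url)}


def verify_token(token, resource_url):
    """Recompute the tag and compare it in constant time."""
    expected = sign(token["version"], resource_url)
    return hmac.compare_digest(expected, token["tag"])


url = "/accounts/42"
token = issue_token(7, url)
forged = {"version": 8, "tag": token["tag"]}   # client changed the integer
```

`verify_token(token, url)` succeeds, while the forged token and the genuine token presented against a different URL both fail, so the server only needs to store data once a valid POST actually arrives.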

> Generally speaking, what is the flow like? Suppose I allocated myself an endpoint, but then never posted anything to it - what does the endpoint actually contain then, if queried?

There are many possible designs here. I prefer something like this in CRUD-like contexts [1], because it gets me a fully auditable change history, optimistic concurrency control and all the storage needed is fully integrated into the app schema in a sensible way.

Combined with the HMAC described above, you don't need to preemptively allocate any storage for idempotent POSTs, and you can use simple integers as your unique identifiers. So a POST against a version number that's not the latest would simply return the version that followed, if that POST was the one that succeeded, or a redirect to the latest version if someone else had updated it.
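One way this could look, as a sketch only. To tell "my POST succeeded" apart from "someone else updated it", the server has to remember something about who created each version; the `writer` field below is an assumption added for that purpose (it could equally be the HMAC tag or a session ID), and the `"ok"`/`"redirect"` statuses are placeholders for real HTTP responses.

```python
class VersionedResource:
    """Append-only version history with optimistic concurrency control."""

    def __init__(self, initial):
        # versions[n] holds the state at version n and the writer that created it
        self.versions = [{"data": initial, "writer": None}]

    @property
    def latest(self):
        return len(self.versions) - 1

    def post(self, base_version, writer, data):
        """Apply an update made against base_version; safe to retry."""
        if not 0 <= base_version <= self.latest:
            return {"status": "bad_request", "version": self.latest}
        if base_version == self.latest:
            # First POST against the latest version: record a new version.
            self.versions.append({"data": data, "writer": writer})
            return {"status": "ok", "version": self.latest}
        following = self.versions[base_version + 1]
        if following["writer"] == writer:
            # Retry of a POST that already succeeded: return the version it created.
            return {"status": "ok", "version": base_version + 1}
        # Someone else updated first: point the client at the latest version.
        return {"status": "redirect", "version": self.latest}


doc = VersionedResource("draft")
created = doc.post(0, "alice", "alice's edit")
retried = doc.post(0, "alice", "alice's edit")   # duplicate of the same POST
stale = doc.post(0, "bob", "bob's edit")         # lost the race against alice
```

`created` and `retried` both report version 1, and only one new version is stored; `stale` is redirected to version 1 so that client can reapply its changes there.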

Hopefully you can imagine various relaxations on this to make other trade-offs. For instance, a much simpler approach would be to return GUIDs for each possible POST, and just store each GUID associated with the object when a POST succeeds. If the object's current GUID is the GUID the user is posting, return the current data with 200 OK; otherwise return a 301 Moved to a URL referencing the latest version and let the client try to apply their updates to that. The storage can be reclaimed after a suitable period of time, say a month if your app is for browser clients.

[1] https://higherlogics.blogspot.ca/2015/10/versioning-domain-e...