controlling access to elasticsearch with filtered aliases, nginx and tokens
Lots of people wonder about controlling access to Elasticsearch, since it doesn't offer any meaningful security on its own. It's been suggested that using filtered aliases along with some nginx rewrites would do the trick.
Really the only thing in the middle is nginx and a small authentication layer.
I was happily coding away with this setup until my boss mentioned the "restricting users for Kibana with filtered aliases" blog post to me.
That sure sounded like my approach was not going to work, or at least not provide real security.
After some experimenting, I still haven't found a way to circumvent the security, so I thought I would summarize the approach for feedback and to allow others to make use of it.
This is pretty much all Elasticsearch with one filtered alias per user. The filters are then used to restrict what each user can see. In our case it's not as simple as just grabbing the user's own data, but the general idea is similar. User credentials are also stored in Elasticsearch, but in a different index that is likewise only accessed via a filtered alias.
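To illustrate, a filtered alias can be created through the Elasticsearch `_aliases` API. The index, alias, and field names below are made up for the example; the real setup uses whatever fields the filters key on.

```json
POST /_aliases
{
  "actions": [
    {
      "add": {
        "index": "data",
        "alias": "data_jane",
        "filter": { "term": { "owner": "jane" } }
      }
    }
  ]
}
```

Searches against `data_jane` then only ever see documents matching the filter, without the user's queries having to mention it.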
This is running nginx. It is, of course, where the application is served from, but it is also the proxy for Elasticsearch and hosts an authentication layer. If auth succeeds, the user's Elasticsearch-bound requests are rewritten to their alias. The authentication is a separate nginx-proxied service that is triggered via the ngx_http_auth_request_module.
how it works
- user pulls up the frontend and enters their creds
- a call is made to the /auth proxy due to an auth_request rule in nginx
- the service listening on /auth pulls the user's crypted password from Elasticsearch and compares it to the crypted version of the provided password
- assuming the password is valid, the /auth service builds an expiring authentication token and returns it, along with the expiration time, to the proxy caller
- the frontend service returns the token to the user (via HTTP headers)
- the user then provides the token, expiration, and username via HTTP headers in all subsequent calls, including Elasticsearch queries
- the auth_request service then validates the headers before any request is proxied to Elasticsearch
- assuming the ticket is valid, the user's request for the data index is rewritten to her index alias; if there is nothing to rewrite, the request fails
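The nginx side of the steps above could be sketched roughly like this. The locations, ports, and the `X-Username` header are illustrative, not our actual configuration; the key pieces are `auth_request` and the variable-based rewrite.

```nginx
# internal endpoint hit by auth_request for every proxied call
location = /auth {
    internal;
    proxy_pass http://127.0.0.1:9090;   # token-validation service
    proxy_pass_request_body off;
    proxy_set_header Content-Length "";
}

location /es/ {
    auth_request /auth;                 # validate token headers first

    # rewrite the generic data index to the user's filtered alias;
    # X-Username is assumed to carry the token-validated username
    rewrite ^/es/data/(.*)$ /data_$http_x_username/$1 break;

    proxy_pass http://127.0.0.1:9200;   # Elasticsearch
}
```

Requests that don't match the rewrite never reach Elasticsearch, which is what keeps the credentials and secret indexes out of reach.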
The token-based auth uses several pieces of data to build the token: essentially the username, the expiration, and a secret. The token only validates if the supplied username and expiration can be recombined on the server with the secret to match the supplied token. This avoids lots of round trips to Elasticsearch to recheck the password. The system also allows for renewing tickets, to avoid having to either store the password or re-prompt for it: providing a valid ticket and asking for a fresh one results in a new ticket.
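A minimal sketch of that scheme, using an HMAC as the combining function (the function names and the hard-coded secret are stand-ins; in our setup the secret comes from Elasticsearch):

```python
import hashlib
import hmac
import time

# stand-in for the shared secret fetched from its own Elasticsearch index
SECRET = b"shared-secret-from-elasticsearch"

def make_token(username, expires, secret=SECRET):
    """Derive a token from the username, expiration, and shared secret."""
    msg = f"{username}:{expires}".encode()
    return hmac.new(secret, msg, hashlib.sha256).hexdigest()

def validate(username, expires, token, secret=SECRET):
    """Recombine username + expiration with the secret on the server and
    compare against the supplied token (constant-time comparison)."""
    if expires < time.time():
        return False
    expected = make_token(username, expires, secret)
    return hmac.compare_digest(expected, token)

def renew(username, expires, token, lifetime=3600, secret=SECRET):
    """Exchange a still-valid ticket for a fresh one; no password needed."""
    if not validate(username, expires, token, secret):
        return None
    new_expires = int(time.time()) + lifetime
    return new_expires, make_token(username, new_expires, secret)
```

Since the token is derived, nothing per-session needs to be stored server-side; any web node holding the secret can validate or renew a ticket.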
The shared secret is stored in yet another index in Elasticsearch, which allows us to scale the web services horizontally without much struggle. That data is not reachable via user requests, since no rewrites will match that index.
Our code does not perform the rewrite based on the remote_user variable. Instead we use the username provided with each request, which has passed through the token validation. This seems to make it impossible to leak information while still avoiding a more complex proxy layer.
The biggest issue right now is that the aliases will need to point to multiple indexes for the user data. That makes it difficult to either update data or retrieve a specific document, since neither can be done via the filtered alias.
I believe augmenting the token layer to check for a document in the user's filtered alias before allowing access (via auth_request) to the real document should be doable, but that is next on the development plan.