This post picks up where my previous post ‘Using Azure Search’ left off.
Azure Search can be manually deployed through the Azure portal. Instructions can be found here https://docs.microsoft.com/en-us/azure/search/search-create-service-portal. This is fairly simple to do, and it is even possible to set up a free version of Azure Search to research or test some ideas. Using the portal is fine for a one-off deployment, but for repeated deployments it is best to use an automated solution.
There are many ways to deploy Azure Search using an automated approach. Outlined below is an example approach:
- Deploy the Azure Search service using ARM (Azure Resource Manager) templates.
- Use Azure CLI to get the admin and query keys and store them securely in a key vault.
- Create indexes, indexers and data sources using an admin key in the request header of a REST call.
- Ensure query keys are securely accessible to clients of Azure Search that need to search the service.
ARM templates can be used to deploy Azure resources, and an Azure Search service is no exception. Syntax information for ARM templates can be found here https://docs.microsoft.com/en-us/azure/azure-resource-manager/resource-group-authoring-templates.
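As a rough illustration, here is a minimal sketch of what such a template could contain, built as a Python dictionary so it can be dumped to a JSON file by a deployment script. The parameter names, the default SKU and the apiVersion are my own assumptions for the example, so check them against the template reference for the version you target.

```python
import json


def build_search_template(api_version: str = "2020-08-01") -> dict:
    """Return a minimal ARM template declaring one Azure Search service.

    The apiVersion default and the parameter names are assumptions made
    for this sketch, not values taken from a real deployment.
    """
    return {
        "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
        "contentVersion": "1.0.0.0",
        "parameters": {
            "searchServiceName": {"type": "string"},
            "sku": {"type": "string", "defaultValue": "basic"},
        },
        "resources": [
            {
                "type": "Microsoft.Search/searchServices",
                "apiVersion": api_version,
                "name": "[parameters('searchServiceName')]",
                "location": "[resourceGroup().location]",
                "sku": {"name": "[parameters('sku')]"},
                "properties": {"replicaCount": 1, "partitionCount": 1},
            }
        ],
    }


if __name__ == "__main__":
    # Print the template so it can be redirected into a .json file.
    print(json.dumps(build_search_template(), indent=2))
```

The resulting file can then be deployed with whichever ARM deployment tooling your pipeline already uses.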
After creating the search instance using ARM templates, we next need to create the indexes, indexers and data sources that will be used when searching. Unfortunately, it is not currently possible to create indexes, indexers and data sources using ARM templates. These components need to be created using the REST API, as outlined previously in part 2, ‘Using Azure Search’. This can be done from PowerShell, bash or anything else that can make REST calls inside your automated deployment pipeline.
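To make the REST step a little more concrete, here is a small sketch in Python that builds the request for creating an index. The service name, index name, fields and api-version are placeholders I have picked for illustration. The request is only constructed here; it is not sent until you pass it to `urllib.request.urlopen`.

```python
import json
import urllib.request

API_VERSION = "2020-06-30"  # assumption: use a version your service supports


def build_create_index_request(service: str, index_name: str,
                               fields: list, admin_key: str) -> urllib.request.Request:
    """Build (but do not send) a PUT request that creates or updates an index."""
    url = (f"https://{service}.search.windows.net/indexes/{index_name}"
           f"?api-version={API_VERSION}")
    body = json.dumps({"name": index_name, "fields": fields}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        # The admin key travels in the api-key request header.
        headers={"Content-Type": "application/json", "api-key": admin_key},
    )


# Hypothetical example values, for illustration only.
request = build_create_index_request(
    service="my-search-service",
    index_name="hotels",
    fields=[
        {"name": "id", "type": "Edm.String", "key": True},
        {"name": "description", "type": "Edm.String", "searchable": True},
    ],
    admin_key="<admin-key-from-key-vault>",
)

# Uncomment to actually send the request from a deployment script:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Indexers and data sources follow the same pattern against their own endpoints, with the same api-key header.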
In order to create the indexes, indexers and data sources, an api-key is required. This is sent as a request header on the REST calls. The admin and query keys are visible in the Azure portal, but copying them from there is not the optimal solution. The query key is required by services using the search service, so it must be available to each of these services. The admin key is required when creating and deleting components, so it must be available as part of your deployment pipeline. The most secure way of making these keys available is for them never to be exposed outside of Azure. Therefore, after creating the service, Azure CLI can be used to query Azure Search for its keys, and these can be used to populate a key vault. This secure key vault can then be referenced by an ARM parameters file (for the services using Azure Search) and by the deployment script (for creating the components within the search service).
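The key hand-off could be scripted along the following lines. This is only a sketch: it assumes the Azure CLI is installed and logged in, and the vault and secret names are made up. Note also that passing a secret with `--value` can expose it in the process list on a shared build agent, so treat this as a starting point rather than a hardened script.

```python
import json
import subprocess


def admin_key_show_cmd(service: str, resource_group: str) -> list:
    """Arguments for reading the admin keys of a search service."""
    return ["az", "search", "admin-key", "show",
            "--service-name", service,
            "--resource-group", resource_group,
            "--output", "json"]


def keyvault_secret_set_cmd(vault: str, secret_name: str, value: str) -> list:
    """Arguments for writing a secret into a key vault."""
    return ["az", "keyvault", "secret", "set",
            "--vault-name", vault,
            "--name", secret_name,
            "--value", value,
            "--output", "none"]  # do not echo the secret back to the log


def run_az(argv: list) -> str:
    """Run an Azure CLI command and return its stdout."""
    return subprocess.run(argv, check=True, capture_output=True, text=True).stdout


def copy_admin_key_to_vault(service, resource_group, vault, secret_name, run=run_az):
    """Read the primary admin key and store it as a key vault secret."""
    keys = json.loads(run(admin_key_show_cmd(service, resource_group)))
    run(keyvault_secret_set_cmd(vault, secret_name, keys["primaryKey"]))
```

A pipeline step would then call something like `copy_admin_key_to_vault("my-search-service", "my-resource-group", "my-vault", "search-admin-key")`, where all four names are hypothetical. The `run` parameter exists so the function can be exercised without touching Azure.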
Azure Search boasts built-in security features and access control to ensure your private content remains private. Encryption at many levels is a fairly recent (2018) addition, with access to read and write operations controlled by different security keys. The search service is also compliant with various security standards. More information on this compliance can be found here https://docs.microsoft.com/en-us/azure/search/search-security-overview#standards-compliance-iso-27001-soc-2-hipaa.
There are two types of security keys that are used to access the search service. These are admin and query keys.
- Admin keys: Provide read/write access to the service
- Query keys: Provide read-only access
For each service, two admin keys are generated. These can be regenerated at any time. For each service, up to 50 query keys can be created to provide clients with read access to your search service. These can also be regenerated.
Encryption comes for free with Azure Search. Microsoft will manage everything from the certificates to the encryption keys. They do not even allow you to turn it off or manage any of the process yourself. The encryption used within Azure Search is based on 256-bit AES encryption.
Encryption in Transit and at Rest
All connections to Azure Search are encrypted. Additionally, encryption occurs automatically on all operations. However, if you created an index before 2018, your index will not be encrypted (although all subsequent operations to this index will be) because this is a fairly recent introduction. Microsoft state that there should be no impact on performance or index size due to encryption.
Data Segregation via Security Filters
Filtering the results returned by Azure Search using a security filter is possible. An extra field should be added to the index, which is a collection of strings.
- Add field “group_ids” (any name can be used) as a Collection(Edm.String). Make sure the field has the filterable attribute set to true.
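- When creating or merging a document, send an array of the ids that are allowed to access this document, e.g. "group_ids": ["group_id1", "group_id2"]
- Finally, when searching, add a filter on “group_ids” so that only documents shared with one of the caller’s groups are returned, for example: group_ids/any(g:search.in(g, 'group_id1, group_id2'))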
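The three steps above can be sketched in Python as follows. The helper name is my own, and the filter uses the `search.in` function with its default space and comma delimiters, which is why ids containing those characters are rejected here.

```python
def build_group_filter(group_ids, field="group_ids"):
    """Return an OData filter matching documents shared with any given group."""
    if not group_ids:
        # A caller with no groups should not get an unfiltered search.
        raise ValueError("at least one group id is required")
    for group_id in group_ids:
        if any(ch in group_id for ch in (",", " ", "'")):
            raise ValueError(f"unsupported character in group id: {group_id!r}")
    joined = ", ".join(group_ids)
    return f"{field}/any(g:search.in(g, '{joined}'))"


# Step 1: the extra field in the index definition.
group_field = {
    "name": "group_ids",
    "type": "Collection(Edm.String)",
    "filterable": True,
}

# Step 2: a document carrying the ids that may see it (hypothetical content).
document = {"id": "1", "title": "Quarterly report",
            "group_ids": ["group_id1", "group_id2"]}

# Step 3: the body of a search request restricted to the caller's groups.
search_body = {"search": "*",
               "filter": build_group_filter(["group_id1", "group_id2"])}
```

Building the filter on the server side, from the groups of the authenticated user, matters here: a client that can send its own filter can also leave it out.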
More information about security filters can be found here https://docs.microsoft.com/en-us/azure/search/search-security-trimming-for-azure-search.
Currently there is no solution for full local development using Azure Search. Microsoft do not provide a local version to run in a Docker container, for instance. Someone has created a wrapper around Solr, https://hub.docker.com/r/simonedeponti/azuresearchemulator/, but this requires you to run Solr locally behind the wrapper. This comes with a few drawbacks, such as the complexity of running Solr just to get up and running. Additionally, a lot of Azure Search functionality is not provided by this emulator, and results may be inconsistent with what is expected.
Another option for development is to connect to a development instance of Azure Search. Each developer could have a separate index within the service. This can cause issues depending on your Azure Search subscription limits as you might be limited to a certain number of indexes. Additionally, each developer has to understand how to manage their own index. I would recommend this approach as it is the simplest and will more closely resemble the results returned in production. An alternative solution is to segregate the index based on which user is making requests. This way, developers can share an index. This can be done via security filters, as described in the ‘Security’ section above. But this adds extra complexity that will only be used in your development environment. For both solutions that connect to an instance of Azure Search in the cloud, you are not able to use indexers or data sources if your database is on your local machine.
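If you go with one index per developer, a small naming helper keeps the indexes predictable. The sketch below is my own convention, not something Azure Search prescribes. It assumes the base name is already a valid index name and that index names are limited to lowercase letters, digits and dashes with a maximum length of 128 characters, which is my understanding of the naming rules.

```python
import re

MAX_INDEX_NAME_LENGTH = 128  # assumption based on the documented naming rules


def developer_index_name(base: str, developer: str) -> str:
    """Derive a per-developer index name such as 'hotels-jane-doe'."""
    # Collapse anything that is not a lowercase letter or digit into one dash.
    slug = re.sub(r"[^a-z0-9]+", "-", developer.lower()).strip("-")
    if not slug:
        raise ValueError("developer name has no usable characters")
    # Truncate to the length limit and avoid ending on a dash.
    return f"{base}-{slug}"[:MAX_INDEX_NAME_LENGTH].rstrip("-")
```

For example, `developer_index_name("hotels", "Jane.Doe")` gives `hotels-jane-doe`, which a developer's local configuration can then point at.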
There are many different solutions, and types of solution, that Azure Search can be compared with. Firstly, I will compare Azure Search to popular database solutions, and then I will compare it to dedicated search solutions, such as Elasticsearch and Solr.
Azure Search vs Database Solutions
Azure Search is not a replacement for a database solution. It provides a powerful search but should be used in conjunction with a datastore. When comparing Azure Search to a database solution, I will compare the search functionality.
SQL Server and other databases provide advanced searching via Full-Text Search. This does not provide the linguistic analysis that is provided by Azure Search. An advantage of Full-Text Search is that you do not need to use a fully dedicated service such as Azure Search. You also do not need to manage two copies of your data, as you would when using Azure Search. If you do not require advanced linguistic analysis then Full-Text Search would be a more suitable choice. However, if you want your search to work with spelling mistakes, word inflexions, synonyms and other lexical analysis tools then Azure Search provides many advantages and will provide a more powerful search for your needs. Another advantage that Azure Search has over Full-Text Search is that it performs better when combining searches. Using Full-Text Search in combination with normal queries can result in slow performance.
Another commonly used database approach is a LIKE query for wildcard searches. LIKE queries often perform poorly, so if performance is key then Azure Search could be the solution to use. A drawback of Azure Search is that, for simple searches, it can return more results than expected. This is even stated in the Microsoft documentation: because advanced analysis is applied to the search terms, a query can match more documents than a literal comparison would.
General SQL searches can be used to search your data. If you do not require advanced linguistic analysis, searching through multiple languages or any other of the powerful search functionality of Azure Search, then general database queries may be best suited to your needs.
Azure Search vs Dedicated Search Solutions
Dedicated search solutions that could be used instead of Azure Search are Elasticsearch and Solr. The main advantage of these other dedicated search solutions is that they are more widely used, so there are more sources of information. Being more widely used does not necessarily mean that a solution is the right one. If you do not want the overhead of running a dedicated solution, then Azure Search could be the correct choice. Everything is managed by Microsoft, leaving you with less overhead. Azure Search is also very easy to set up and provides many advanced features, which I have discussed throughout this blog.
One main drawback of using Azure Search is the limits that you hit on each pricing tier. Elasticsearch and Solr are also easy to set up and can provide equally powerful search functionality. One main advantage of other search solutions is that they are often open source and free to use. This is the case for Elasticsearch, but with any free solution there can be additional costs for things such as plugins, infrastructure and support.
Throughout this blog I have detailed what Azure Search is, its benefits and how to use it. Azure Search provides a lot of powerful features, and a lot of them come out of the box. Microsoft also have a free trial subscription that lets you test different configurations and setups to find the best solution. I would recommend doing this to verify whether Azure Search is the right solution for you.
An important point is that it can be hard to find the right configuration for Azure Search. This is because there are many possible options to choose from and many advanced configuration options to consider. It can therefore be very time-consuming to find the best approach, but once you have found it, it can be very rewarding.
Azure Search is a fascinating technology, especially the linguistic analysis behind how it works. It is partly built on top of Apache Lucene and this is a very interesting area of technology to study.
After using Azure Search, I would not recommend using it as a database replacement. Although it can provide search functionality, it cannot handle data in the same way as a traditional database. Microsoft’s documentation states that Azure Search is rarely a true replacement for data storage. If you are going to use some of the advanced features then this could be the solution for you, but if you only want to perform simple searches then simpler solutions should be investigated.
I hope this blog has provided some insight into how to use Azure Search, and also why to use Azure Search.