Automation of IT resource allocation is the essential foundation for generalizing the principles of continuous development, integration and deployment.
Internal Case Study
With its three divisions (Geospatial, Business, Infrastructure), the Camptocamp group is inherently organized around DevOps methodologies. The main objective of this approach is to align our employees (specialized engineers, business analysts, integrators, testers) around common goals, so that they work together to produce value and satisfy our customers. Promoting agility and organizing our teams around automated, cross-functional processes is at the heart of our concerns.
IT service and process automation cannot be achieved without a flexible infrastructure and programmable interfaces (APIs) that allow on-demand resources (computing power, storage, user and access management) to be created and deleted instantly. This ability to obtain self-service resources is fundamental for a company like ours, which must be able to seize opportunities very quickly by rapidly developing new projects, making prototypes available over the Internet, and continually testing proposed changes. We need to do this without relying exclusively on external service providers, while guaranteeing the location and confidentiality of the data processed, and without compromising on security.
OpenStack adoption, history
As early as 2007, Camptocamp pioneered the use of Amazon AWS services for the deployment of several production projects. This experience profoundly changed our expectations in terms of resource allocation. We moved from a manual mode, generally slow and static, to an automated mode, almost instantaneous and dynamic. Managing infrastructure resources programmatically (Infrastructure-as-Code) makes it possible to apply proven development practices such as tracking changes and reviewing code before changes are applied. A real revolution.
The use of public cloud computing solutions (such as Amazon AWS) poses a number of problems related to data location and the jurisdiction of the companies that provide these services. Many companies are particularly sensitive to these issues and de facto exclude the use of these services or restrict their choice to solutions offered by companies located in a specific country. In Switzerland (and given the size of the market), solutions are scarce and often very limited, years away from what is being offered by the leader in this field, Amazon AWS.
Based on this observation, and having discovered the emerging OpenStack open source project, we decided in 2012 to replace our internal virtualized infrastructure (based at the time on OpenVZ) with a self-service infrastructure based on OpenStack. Three main objectives drove our approach: to have a private cloud “à la Amazon AWS” in Switzerland for our internal hosting needs, to increase our expertise, and to contribute to the development of this very promising open source solution initiated by Rackspace and NASA.
This first experience in 2012 was not easy: the OpenStack project covers many areas and ultimately manages all the components of an infrastructure (network, storage, computing, identity, etc.). This complexity required expertise in many fields, in particular the virtualized network layer built on the principle of software-defined networking (SDN).
Between 2012 and 2016, we operated an OpenStack cluster of about 30 physical servers (Intel x86) with a maximum capacity of about 300 virtual servers. All of our old virtualized infrastructure was migrated to this cluster and, since then, we have followed the project closely and taken part in various community events (OpenStack Summit, meetups).
We started in 2012 with the Grizzly version of OpenStack, followed by two successive migrations, from Grizzly to Havana and from Havana to Icehouse. At that time, the migration process was relatively complex and caused us many problems, especially because of the significant changes in the network layer. Fortunately, this upgrade process has improved with each release.
2016: new cluster
In 2016, we carried out several consulting engagements for our clients around OpenStack. Beyond the choice of OpenStack itself, many of our customers were wondering, just like us, which installer to choose. At that time, we wanted to completely reinstall our OpenStack cluster in order to take advantage of the major evolutions of the project, to make important changes in terms of hardware and network topology, and to start from a clean slate after our past deployments.
Just as Linux and its tools are offered through different distributions (Red Hat, Debian, Ubuntu, etc.), OpenStack is complex to deploy, and different companies and solutions aim to simplify this process. The challenge concerns not only installation but also maintenance: keeping the cluster operational and simplifying migrations to new versions.
At the beginning of 2016, two solutions were competing fiercely in this area: Mirantis with its OpenStack deployment solution named “Fuel”, and Red Hat with a solution based on the OpenStack TripleO project, incubated by Red Hat through the RDO project. It is worth noting that, beyond their rivalry, Red Hat and Mirantis were both very active contributors to the OpenStack project.
In the end, and for our own needs, we clearly chose RDO (or its enterprise version, Red Hat OpenStack Platform). Even if the installation process seemed at that time less complete in terms of features than the solution proposed by Mirantis, the underlying project was 100% open source and the efforts made by Red Hat were for us a real guarantee of the solution's sustainability. We have been a Red Hat partner for many years and know the value of the solutions provided by this company.
Here are the different components of our OpenStack cluster, which are detailed in the remainder of this document.
- the Horizon dashboard which allows you to manage all the services through an intuitive and very complete Web interface
- instead of local accounts, we have connected all OpenStack services to our existing identity management system, which is based on the FreeIPA open source project (distributed commercially under the name Red Hat Identity Management)
- on the network side, we use Neutron with Open vSwitch and have also enabled the LBaaS v2 service (Load-Balancing-as-a-Service). The networks of the various tenants are implemented with VXLAN (Virtual Extensible LAN), also provided by the Neutron service
- network storage volumes are delivered through a Ceph cluster fully integrated with the OpenStack TripleO deployment. These volumes are additional storage spaces that must be attached to the instances
- the instances themselves are managed by the Nova service, which is configured with the host aggregates mechanism so that we can offer different generations of instances. In our case, we offer two generations: those instantiated on our older servers (spinning hard disks) and those created on the new generation of servers equipped with solid-state drives (SSD).
- the Glance service uses Ceph to store images of the different distributions proposed for installing new virtual servers
- object storage (equivalent to Amazon AWS S3) is also provided by our Ceph cluster interfaced via the Swift service
- we do not use the Telemetry service, preferring a solution based on CollectD and Prometheus, tools that we already know well and use in other contexts.
- the orchestration of the OpenStack cluster relies on a combination of tools (Mistral, Heat, Ansible, Puppet) that automates all the administration tasks. The creation of resources in the cluster itself is fully automated with Terraform
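To make the host aggregates idea more concrete, here is a minimal Python sketch of the matching principle: hosts are grouped into aggregates that carry metadata, and an instance type (flavor) is only placed on hosts whose aggregate metadata satisfies its requirements. This is a simplified model written for illustration, not Nova's actual scheduler code; the host names, the flavor names and the `disk` metadata key are hypothetical and do not come from our configuration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Aggregate:
    """A named group of compute hosts carrying free-form key/value metadata."""
    name: str
    metadata: dict[str, str]
    hosts: set[str] = field(default_factory=set)


@dataclass
class Flavor:
    """An instance type; extra_specs list what the target hosts must offer."""
    name: str
    extra_specs: dict[str, str] = field(default_factory=dict)


def eligible_hosts(flavor: Flavor, hosts: list[str],
                   aggregates: list[Aggregate]) -> list[str]:
    """Return the hosts on which an instance of this flavor may be placed."""
    # A flavor without requirements can run anywhere (simplifying assumption).
    if not flavor.extra_specs:
        return list(hosts)
    selected = []
    for host in hosts:
        for aggregate in aggregates:
            matches = all(aggregate.metadata.get(key) == value
                          for key, value in flavor.extra_specs.items())
            if host in aggregate.hosts and matches:
                selected.append(host)
                break  # one matching aggregate is enough for this host
    return selected


# Hypothetical inventory: two older hosts with hard disks, one newer with SSD.
HOSTS = ["compute-old-1", "compute-old-2", "compute-new-1"]
AGGREGATES = [
    Aggregate("generation-1", {"disk": "hdd"}, {"compute-old-1", "compute-old-2"}),
    Aggregate("generation-2", {"disk": "ssd"}, {"compute-new-1"}),
]

if __name__ == "__main__":
    print(eligible_hosts(Flavor("g2.small", {"disk": "ssd"}), HOSTS, AGGREGATES))
    print(eligible_hosts(Flavor("g1.small", {"disk": "hdd"}), HOSTS, AGGREGATES))
```

In this model, offering a new generation of instances amounts to adding an aggregate and a flavor that points at it, without changing anything for existing instances.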
At the network level, here is what has been put in place:
Some details in relation to the above diagram:
- each tenant has its own private subnetwork and virtual router; tenants are completely isolated from each other and each has its own resource quota
- we have a single pool of public IP addresses shared between the different tenants. An instance that is to be exposed directly on the Internet must have an IP address from this pool assigned to it. For access to instances with only a private address, we have set up SSH bastions
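The shared pool of public addresses can be pictured with the following Python sketch. It is a toy model under stated assumptions, not how Neutron is implemented: the class and exception names are hypothetical, the address range is a documentation range (203.0.113.0/29) rather than our real one, and it assumes that the per-tenant quota also limits the number of public addresses a tenant may hold.

```python
from __future__ import annotations

import ipaddress


class QuotaExceeded(Exception):
    """Raised when a tenant already holds its maximum number of public IPs."""


class PoolExhausted(Exception):
    """Raised when no public IP is left in the shared pool."""


class PublicIPPool:
    """A single pool of public addresses shared by all tenants."""

    def __init__(self, cidr: str, quota_per_tenant: int) -> None:
        # Usable host addresses of the range, in ascending order.
        self._free = list(ipaddress.ip_network(cidr).hosts())
        self._quota = quota_per_tenant
        # tenant name -> {instance name -> assigned address}
        self._assigned: dict[str, dict[str, object]] = {}

    def associate(self, tenant: str, instance: str):
        """Assign a public IP to an instance, honoring the tenant quota."""
        owned = self._assigned.setdefault(tenant, {})
        if instance in owned:
            return owned[instance]  # already exposed: keep the same address
        if len(owned) >= self._quota:
            raise QuotaExceeded(f"{tenant} already holds {self._quota} public IPs")
        if not self._free:
            raise PoolExhausted("no public IP left in the pool")
        address = self._free.pop(0)
        owned[instance] = address
        return address

    def release(self, tenant: str, instance: str) -> None:
        """Return an instance's public IP to the shared pool, if it has one."""
        address = self._assigned.get(tenant, {}).pop(instance, None)
        if address is not None:
            self._free.append(address)


if __name__ == "__main__":
    pool = PublicIPPool("203.0.113.0/29", quota_per_tenant=2)
    print(pool.associate("tenant-a", "web"))
    print(pool.associate("tenant-b", "web"))
```

Instances that never call `associate` keep only their private address, which is the case covered by the SSH bastions.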
All these services are exposed through APIs and can be used in self-service: resource management can be done via the web interface, on the command line or via Terraform. Monitoring relies on performance and availability indicators, which are collected by CollectD and Prometheus, with alerting handled by the Prometheus Alertmanager. For troubleshooting, we rely on the logs produced by the various OpenStack services and consolidated in our existing ELK cluster.
The implementation of this new OpenStack cluster allowed us to appreciate how much the various services integrated in this project have evolved, as well as the significant improvements in the upgrade process. In our opinion, it is undoubtedly the most advanced open source solution for the deployment of an on-premises private cloud. The features provided are very close to the core services of the main public clouds (Amazon AWS and Microsoft Azure) and the development of the solution is backed by a very large and dynamic community of developers. Numerous prestigious companies support this project through the OpenStack Foundation.
Having our own private cloud offers us great flexibility for hosting our internal services or developments. It also allows us to maintain a very high level of expertise in OpenStack, while keeping costs under control for services whose sizing is known. Through our work on the implementation and management of our own OpenStack cluster, we have also had the opportunity to contribute to various aspects related to our needs, in particular with regard to the TripleO deployment system.
Automating resource creation with Terraform facilitates the implementation of hybrid cloud architectures that blend or move resources between private and public clouds as needed. This flexibility allows you to distribute the different services of a project across different cloud solutions according to multiple criteria (costs, localization, service availability, business guidelines, capacity planning, etc.).
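As an illustration of such a criteria-based distribution, the following Python sketch picks, for each service of a project, the cheapest cloud that satisfies its constraints. It is only a sketch under stated assumptions: the cloud names, countries, costs and service catalogs are hypothetical values chosen for the example, and real decisions would weigh more criteria than the three modeled here (cost, data location, service availability).

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cloud:
    name: str
    country: str               # where the data is hosted
    monthly_cost: float        # hypothetical cost per VM, arbitrary unit
    services: frozenset[str]   # what the cloud offers


@dataclass(frozen=True)
class Service:
    name: str
    needs: frozenset[str]                               # required cloud services
    allowed_countries: Optional[frozenset[str]] = None  # None: no constraint


def place(service: Service, clouds: list[Cloud]) -> Optional[Cloud]:
    """Return the cheapest cloud meeting the service's needs, or None."""
    candidates = [
        cloud for cloud in clouds
        if service.needs <= cloud.services
        and (service.allowed_countries is None
             or cloud.country in service.allowed_countries)
    ]
    return min(candidates, key=lambda cloud: cloud.monthly_cost, default=None)


# Hypothetical catalog: one private cloud in Switzerland, one public cloud abroad.
CLOUDS = [
    Cloud("private-openstack", "CH", 30.0,
          frozenset({"compute", "block-storage", "object-storage"})),
    Cloud("public-cloud", "IE", 20.0,
          frozenset({"compute", "block-storage", "object-storage",
                     "managed-database"})),
]

if __name__ == "__main__":
    project = [
        Service("customer-data", frozenset({"compute", "object-storage"}),
                frozenset({"CH"})),
        Service("public-demo", frozenset({"compute"})),
    ]
    for service in project:
        target = place(service, CLOUDS)
        print(service.name, "->", target.name if target else "no suitable cloud")
```

Because the resources themselves are described as code, moving a service from one cloud to another then becomes a change of target in that description rather than a manual migration.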
We are currently working on deploying our own OpenShift cluster on top of our new OpenStack cluster. Having our own IaaS (OpenStack) and PaaS (OpenShift) infrastructure will allow us to meet most of our internal needs, whether they call for deployments directly at the level of the virtualization layer or through an additional abstraction layer based on containers (microservices architecture).