Most people with hands-on OpenStack experience would probably agree that installing and testing it can be hard and frustrating, especially if you do not own a small data center or the physical infrastructure needed to support its installation… However, wouldn’t it be great to use a public cloud provider and its infrastructure to experiment and create POC environments? In this article, we will cover the basic steps of spinning up an OpenStack POC environment on AWS and the things you should modify in order to get OpenStack up and running on almost any cloud provider. So, where to start? The very beginning, of course!
OpenStack is an IaaS solution built from several open-source projects, each developed independently. That means the projects can be used either within or outside OpenStack’s context (Swift, which provides object storage, is a good example). End users can manage – in a cloud fashion – the resources OpenStack exposes. Understanding how OpenStack works in the background (just some basic concepts) will greatly help us during the installation process. OpenStack is a distributed system and, as such, has a Control and a Data Plane. Quoting from OpenStack’s architecture requirements page:
When designing an OpenStack cloud, it is important to consider the needs dictated by the Service Level Agreement (SLA). This includes the core services required to maintain availability of running Compute service instances, networks, storage, and additional services running on top of those resources. These services are often referred to as the Data Plane services and are generally expected to be available all the time.
The remaining services, responsible for create, read, update and delete (CRUD) operations, metering, monitoring, and so on, are often referred to as the Control Plane. The SLA is likely to dictate a lower uptime requirement for these services.
There are many ways and methods to install and maintain OpenStack. However, the so-called minimal deployment always comprises the following services: Keystone (Identity), Glance (Image), Nova (Compute), and Neutron (Networking).
Those are the bare minimum services we need in order to get OpenStack up and running. Let us now have a look at what each service/project does and try to understand why we need it.
OpenStack is a heavy user of APIs. Each service has its own API endpoint and needs to communicate with other services (and APIs) to function properly. Hence, we need a method for each service and user to authenticate and authorize itself against all the other OpenStack services. The latter is realized with Keystone – one of the most important services in OpenStack – which also provides RBAC capabilities to our environment. Quoting from Keystone’s project page:
Keystone is an OpenStack service that provides API client authentication, service discovery, and distributed multi-tenant authorization by implementing OpenStack’s Identity API.
That means each attempt by a service or user to communicate with another OpenStack service must first go through Keystone. It is Keystone’s responsibility to handle any AAA (Authentication, Authorization, Accounting) attempt against any OpenStack resource.
Where is all this information about the available APIs and services kept? This is where a database comes into play (usually a MariaDB database is deployed). Each service has its own database and, each time we create an endpoint, we are actually creating a new database entry. Creating a database is a prerequisite step for each service we want to install.
For example, the first step for installing Keystone is (from OpenStack’s installation page):
Before you install and configure the Identity service, you must create a database.
Replace KEYSTONE_DBPASS with a suitable password.
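For reference, the database-creation step from the Rocky install guide looks like this. It is run on the controller node, and KEYSTONE_DBPASS is a placeholder you should replace with a real password:

```shell
# Create the Keystone database and grant access to the keystone DB user
# (KEYSTONE_DBPASS is a placeholder password)
mysql -u root -p <<'SQL'
CREATE DATABASE keystone;
GRANT ALL PRIVILEGES ON keystone.* TO 'keystone'@'localhost'
  IDENTIFIED BY 'KEYSTONE_DBPASS';
GRANT ALL PRIVILEGES ON keystone.* TO 'keystone'@'%'
  IDENTIFIED BY 'KEYSTONE_DBPASS';
SQL
```

The same pattern (create database, grant to a service user) repeats for Glance, Nova, Neutron and every other service you install.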
… and usually, the next step is to create the service user and the endpoints, e.g. for Nova:

openstack endpoint create --region RegionOne \
  compute public http://controller:8774/v2.1
It is worth mentioning here that there are three different endpoint types in OpenStack:
Public: We can use public endpoints to give external users access to our services/resources.
Admin: Administrators can use this endpoint to manage OpenStack’s infrastructure and services.
Internal: Services use that endpoint for internal communication.
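For Nova, for example, the Rocky install guide creates one endpoint of each type, all pointing at the same URL in a single-node setup (this assumes the "compute" service entry already exists in Keystone):

```shell
# Create the three Compute (nova) endpoints, one per endpoint type
openstack endpoint create --region RegionOne \
  compute public http://controller:8774/v2.1
openstack endpoint create --region RegionOne \
  compute internal http://controller:8774/v2.1
openstack endpoint create --region RegionOne \
  compute admin http://controller:8774/v2.1
```

In a production deployment the public endpoint would usually resolve to an externally reachable address, while the internal and admin endpoints stay on the management network.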
Glance is OpenStack’s image service and it is the service we’re using for making images available to our end users. Each time we want to make an image available we just need to upload it to Glance.
Glance image services include discovering, registering, and retrieving virtual machine images. Glance has a RESTful API that allows querying of VM image metadata as well as retrieval of the actual image. VM images made available through Glance can be stored in a variety of locations from simple filesystems to object-storage systems like the OpenStack Swift project.
Nova is the service that abstracts a server’s underlying resources and lets us consume its compute capacity – including bare metal, virtual machines, and containers.
OpenStack’s networking is based on Neutron, a Software-Defined Networking service and usually one of the most complex services to set up. The complexity lies in the fact that OpenStack’s networking depends heavily on the existing physical network installation – and networking itself can sometimes be quite complex. When it comes to deploying networking, we can use either:
Provider Networks: Non-virtualized networks. VMs are directly connected to the underlying physical network and get their IPs from the existing external network infrastructure.
Self-service Networks: Provide the ability to create overlay virtual networks using, e.g., VXLAN tunnels. We need to attach floating IPs to our VMs and services in order to access them from the outside world.
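As a sketch, creating each kind of network with the CLI might look like this (the network names, the "provider" physical-network label, and the CIDRs are example values):

```shell
# Provider network: mapped directly onto the physical "provider" network
openstack network create --share --external \
  --provider-physical-network provider \
  --provider-network-type flat provider

# Self-service network: Neutron builds an overlay (VXLAN in our scenario)
openstack network create selfservice
openstack subnet create --network selfservice \
  --dns-nameserver 8.8.8.8 --gateway 192.168.1.1 \
  --subnet-range 192.168.1.0/24 selfservice
```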
An excellent article about OpenStack’s different networking options can be found here.
Let’s see what kind of resources we will be needing in order to get started with the installation:
VPC – Networking
1 VPC with 2 subnets
Subnet A: 172.31.32.0/24
Subnet B: 172.31.33.0/24
1 reserved floating IP
We will need to access the OpenStack Dashboard and APIs, so make sure you reserve a floating IP. We will use that IP when setting up the public API endpoints.
Since this is a testing environment, open up the permissions on your ACLs so that all network traffic between EC2 instances is allowed.
Double-check your routing table and make sure that Subnet B has access to the internet. We are trying to emulate a production environment, so it should not be necessary for Subnet A to have internet access.
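If you prefer the AWS CLI over the console, the layout above can be sketched roughly as follows (the CIDRs are example values, and Subnet B’s range is an assumption that matches Subnet A’s addressing):

```shell
# Create the VPC and the two subnets (example CIDRs)
VPC_ID=$(aws ec2 create-vpc --cidr-block 172.31.0.0/16 \
  --query 'Vpc.VpcId' --output text)
aws ec2 create-subnet --vpc-id "$VPC_ID" --cidr-block 172.31.32.0/24  # Subnet A
aws ec2 create-subnet --vpc-id "$VPC_ID" --cidr-block 172.31.33.0/24  # Subnet B

# Reserve the floating (Elastic) IP for the public OpenStack endpoints
aws ec2 allocate-address --domain vpc
```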
For the minimal deployment we need to create the following resources:
1 x Controller Node:
(Controller hosts most of the core OpenStack services)
Network Interfaces: 3
eth0: Management & Overlay Network – Subnet A
eth1: Provider Network – Subnet B (OpenStack has a special configuration for this interface and it does not use an IP → https://docs.openstack.org/install-guide/environment-networking-controller.html)
eth2: Public internet access – Subnet B (We will be using this interface to route traffic from eth1 and provide access to the Internet)
1 x Compute Node:
Network Interfaces: 2
1 x Block Storage Node:
Disk1 (Root): 8 GB
2 x Object Storage Nodes:
We will be installing OpenStack Rocky – https://docs.openstack.org/rocky/install/ but the guide should work with all other OpenStack releases.
CentOS 7 – https://docs.openstack.org/install-guide/environment-packages-rdo.html
But feel free to choose the operating system you prefer; the OpenStack documentation provides instructions for openSUSE/SLES, RHEL/CentOS, and Ubuntu.
Self-Service Networks with Linux-bridges : https://docs.openstack.org/neutron/rocky/install/overview.html#network2
Things to consider:
Virtualization: EC2 instances are VMs, so they are already running on top of a virtualization layer. The bad news is that, as of this writing, AWS does not support nested virtualization on regular instances. Hence, we will use QEMU instead of KVM to fully emulate a system. It is going to be slower, but absolutely fine for POC scenarios. Otherwise, if your pocket can handle it, you might want to experiment with AWS bare-metal instances and install a hypervisor of your choice!
Networking: Installing OpenStack normally requires access to the networking equipment in order to make some modifications and optimizations. Here, however, we are using AWS’s network, which by default does not allow traffic from arbitrary VMs and IPs to flow within it. (If it did, we could start questioning the security of the service.) Therefore, we will have to apply some workarounds, which is why this article has a separate OpenStack Networking on AWS section.
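On the virtualization point, switching Nova from KVM to QEMU is a one-line change on the compute node, documented in the standard install guide:

```ini
# /etc/nova/nova.conf on the compute node
[libvirt]
virt_type = qemu
```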
For the OpenStack POC installation on AWS we will be following the steps described in the following guide:
OpenStack is a community-backed open-source project, and the documentation is sometimes outdated. Always double-check that the APIs and endpoints referenced are the ones you actually installed.
E.g., if you installed Nova API v2.1, make sure you use that version and not v1 – even if the official documentation says otherwise!
OpenStack Networking on AWS
The management network uses the eth0 (green) interfaces and is the network OpenStack uses for its internal communication needs. OpenStack services use IPs from this subnet to talk to each other. (In a real-world scenario, internet access is not mandatory for this network.) The same network is used to build the overlay networks. We have several options for which encapsulation technology to use for the overlay network; in our scenario we used VXLAN. Each time we create a new self-service (virtualized internal) network, a new VXLAN is created on the network node. We will not see any connectivity issues among machines that belong to the same overlay network, since the traffic is encapsulated and AWS sees what it needs to see (a known MAC and IP address). However, our instances will not be able to access the external world. Let’s try to understand why by having a look at the picture below:
The self-service network is an overlay network (it uses eth0), and all traffic generated within it travels through a VXLAN tunnel. However, when an instance tries to communicate with the external world (the provider network here), we get blocked, because the traffic needs to pass through the vrouter’s external gateway port and out via eth1, the provider-network interface. Here we need to remember that OpenStack has a special configuration for this interface and it does not use an IP → https://docs.openstack.org/install-guide/environment-networking-controller.html.
Let’s have a look at the vrouter’s routing table:
It is clear that if we want to communicate with the external world, we need to send our traffic through the subnet’s gateway. However, this is an AWS-managed gateway, and it expects a specific IP and MAC address. Since our packets carry an IP (the external gateway IP) different from the one AWS expects, our traffic gets blocked. Another indicator that something is wrong is the ARP table: an arp -n within the vrouter namespace reveals that the vrouter cannot resolve the MAC address of the provider network (AWS network) gateway – you will see “incomplete” against the subnet’s gateway entry. What we need to do is change the vrouter’s MAC and IP to the values the AWS subnet expects, and assign a random MAC address to eth1.
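To reproduce this diagnosis yourself, you can inspect the router’s network namespace on the controller/network node (the router UUID below is a placeholder – copy the real one from ip netns list):

```shell
# List the namespaces and find the qrouter one
ip netns list

# Routing table and ARP cache inside the vrouter namespace
ip netns exec qrouter-<router-uuid> ip route
ip netns exec qrouter-<router-uuid> arp -n
# An "(incomplete)" entry for the subnet gateway means ARP resolution failed
```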
The problem is well described here as well. However, the suggested solution did not work for me out of the box: changing the vrouter’s MAC address was not possible through the CLI (I got an “already in use” error), so I had to modify Neutron’s SQL database directly. Moreover, changing the default MAC address of the NIC on AWS leads to connectivity issues, and more modifications were required to make this work. You can try this out yourself: spin up a VM with one NIC, ssh to it, change its default MAC address, and try connecting to it again. You are locked out 🙂.
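A rough sketch of the database workaround, assuming Neutron’s default MySQL schema (the port UUID and MAC address are placeholders, and you should back up the database before touching it):

```shell
mysql -u root -p neutron <<'SQL'
-- Give the router's external gateway port the MAC the AWS subnet expects
UPDATE ports
   SET mac_address = '<mac-expected-by-aws>'
 WHERE id = '<gateway-port-uuid>';
SQL

# Restart the L3 agent so the change is applied to the router namespace
systemctl restart neutron-l3-agent
```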
Generally, what we are trying to do here is solve two problems: the first is Layer 2 connectivity and the second is Layer 3 connectivity.
We are on an AWS LAN, so we need to make sure that ARP works correctly. This is the reason for modifying the MAC address of the vrouter’s external gateway port.
We need to ensure that our traffic is routed properly through the network and can bypass AWS restrictions. This is why we added an additional NIC to the controller (eth2): we will use that interface to route all traffic from eth1. However, since the AWS network expects to see packets with a specific MAC:IP pair, we need to NAT the eth2 traffic flowing towards AWS. The following command should do the trick:
iptables -t nat -A POSTROUTING -o eth2 -j MASQUERADE
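The MASQUERADE rule only helps if the kernel actually forwards packets between the interfaces, so make sure IP forwarding is enabled on the controller as well:

```shell
# Enable routing on the controller (add to /etc/sysctl.conf to persist)
sysctl -w net.ipv4.ip_forward=1

# NAT everything leaving through eth2 towards the AWS network
iptables -t nat -A POSTROUTING -o eth2 -j MASQUERADE
```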
We are now somewhat familiar with how packets flow within the OpenStack and AWS networks, as well as with the problems we are trying to tackle. So let’s describe the steps we need to follow to fix OpenStack’s networking on AWS (or on any other cloud environment with similar networking limitations).
Follow the steps below: change the vrouter’s external gateway port MAC and IP to the values the AWS subnet expects, give eth1 a new random MAC, route the provider traffic from eth1 through eth2, and NAT the traffic leaving eth2.
We’re now ready! Our internal instances should be able to reach the internet! Have fun playing with OpenStack!