When looking at the network level of infrastructure security, it is important to distinguish
between public clouds and private clouds. With private clouds, there
are no new attacks, vulnerabilities, or changes in risk specific
to this topology that information security personnel need to
consider. Although your organization’s IT architecture may change with the
implementation of a private cloud, your current network topology will
probably not change significantly. If you have a private extranet in place
(e.g., for premium customers or strategic partners), for practical
purposes you probably have the network topology for a private cloud in
place already. The security considerations you have today apply to a
private cloud infrastructure, too. And the security tools you have in
place (or should have in place) are also necessary for a private cloud and
operate in the same way. Figure 1 shows the
topological similarities between a secure extranet and a private
cloud.
However, if you choose to use public cloud services, changing security requirements will
require changes to your network topology. You must address how your
existing network topology interacts with your cloud provider’s network
topology. There are four significant risk factors in this use case:
Ensuring the confidentiality and integrity of your organization’s data-in-transit to and from your public cloud provider
Ensuring proper access control (authentication, authorization, and auditing) to whatever resources you are using at your public cloud provider
Ensuring the availability of the Internet-facing resources in a public cloud that are being used by your organization, or have been assigned to your organization by your public cloud providers
Replacing the established model of network zones and tiers with domains
We will discuss each of these risk factors in the sections that
follow.
1. Ensuring Data Confidentiality and Integrity
Some resources and data previously confined to a private network are now
exposed to the Internet, and to a shared public network belonging to a
third-party cloud provider.
An example of problems associated with this first risk factor is
an Amazon Web Services (AWS) security vulnerability reported
in December 2008. In a blog post, the author detailed a flaw in the digital
signature algorithm used when “... making Query (aka REST) requests to
Amazon SimpleDB, to Amazon Elastic Compute Cloud (EC2), or to Amazon
Simple Queue Service (SQS) over HTTP.” Although use of HTTPS (instead of
HTTP) would have mitigated the integrity risk, users not using HTTPS
(but using HTTP) did face an increased risk that their data could have
been altered in transit without their knowledge.
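The integrity risk described above depends on how a request is signed as much as on whether HTTPS is used. The source does not detail the flaw, so the following is a generic sketch of one way a request-signing scheme can go wrong, not a reconstruction of the AWS scheme. The secret key, parameter names, and function names are all hypothetical. It shows that signing parameters joined without delimiters lets two different requests share one signature, whereas a delimited, percent-encoded form does not.

```python
import hashlib
import hmac
from urllib.parse import quote

# Assumption: an illustrative shared secret, not a real credential.
SECRET_KEY = b"hypothetical-secret-key"


def sign_ambiguous(params: dict) -> str:
    """Weak: sign sorted keys and values joined with no delimiters."""
    message = "".join(key + value for key, value in sorted(params.items()))
    return hmac.new(SECRET_KEY, message.encode("utf-8"), hashlib.sha256).hexdigest()


def sign_delimited(params: dict) -> str:
    """Stronger: percent-encode each field and join with explicit delimiters."""
    message = "&".join(
        f"{quote(key, safe='')}={quote(value, safe='')}"
        for key, value in sorted(params.items())
    )
    return hmac.new(SECRET_KEY, message.encode("utf-8"), hashlib.sha256).hexdigest()


def verify(params: dict, signature: str) -> bool:
    """Check a signature in constant time against the delimited form."""
    return hmac.compare_digest(sign_delimited(params), signature)


# Two different requests that serialize to the same undelimited string.
original = {"Amount": "10", "AmountX": "0"}
tampered = {"Amount": "10AmountX0"}

collision = sign_ambiguous(original) == sign_ambiguous(tampered)
distinct = sign_delimited(original) != sign_delimited(tampered)
```

With the undelimited scheme, an attacker who can alter an HTTP request in transit could swap one parameter set for the other without invalidating the signature. HTTPS removes the opportunity to alter the request, and an unambiguous encoding removes the collision.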
2. Ensuring Proper Access Control
Since some subset of these resources (or maybe even all of them) is
now exposed to the Internet, an organization using a public cloud faces
a significant increase in risk to its data. The ability to audit the
operations of your cloud provider’s network (let alone to conduct any
real-time monitoring, such as on your own network), even after the fact,
is probably non-existent. You will have decreased access to relevant
network-level logs and data, and a limited ability to thoroughly conduct
investigations and gather forensic data.
An example of the problems associated with this second risk factor
is the issue of reused (reassigned) IP addresses. Generally speaking, cloud providers do not
sufficiently “age” IP addresses when they are no longer needed for one
customer. Addresses are usually reassigned and reused by other customers
as they become available. From a cloud provider’s perspective this makes
sense. IP addresses are a finite quantity and a billable asset. However,
from a customer’s security perspective, the persistence of IP addresses
that are no longer in use can present a problem. A customer can’t assume
that network access to its resources is terminated upon release of its
IP address. There is necessarily a lag time between the change of an IP
address in DNS and the clearing of that address in DNS caches. There
is a similar lag time between when physical (i.e., MAC) addresses are
changed in ARP tables and when old ARP addresses are cleared from cache;
an old address persists in ARP caches until they are cleared. This means
that even though addresses might have been changed, the (now) old
addresses are still available in cache, and therefore they still allow
users to reach these supposedly non-existent resources. Recently, there
were many reports of problems with “non-aged” IP addresses at one of the
largest cloud providers; this was likely an impetus for an AWS
announcement of the Amazon Elastic IP capabilities in March
2008. (With Elastic IP addresses, customers are given a block of five
routable IP addresses over which they control assignment.) Additionally,
according to Simson Garfinkel:
A separate ongoing problem with the load balancers causes them
to terminate any TCP/IP connection that contains more than
2^31 bytes. This means that objects larger
than 2GB must be stored to S3 in several individual transactions, with
each of those transactions referring to different byte ranges of the
same object.
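The byte-range workaround in the quotation can be sketched as follows. This is an illustration only: the 2^31-byte limit comes from the quotation, while the 1 GiB default chunk size (chosen to leave headroom below the limit) and the function name are assumptions. The sketch computes the ranges and does not call any storage API.

```python
from typing import Iterator, Tuple

CONNECTION_LIMIT = 2**31  # bytes per connection, per the quoted report
DEFAULT_CHUNK = 2**30     # assumption: 1 GiB, leaving headroom below the limit


def byte_ranges(total_size: int, chunk_size: int = DEFAULT_CHUNK) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) byte offsets that cover total_size bytes."""
    if total_size < 0:
        raise ValueError("total_size must be non-negative")
    if not 0 < chunk_size <= CONNECTION_LIMIT:
        raise ValueError("chunk_size must be positive and within the connection limit")
    start = 0
    while start < total_size:
        end = min(start + chunk_size, total_size) - 1
        yield start, end
        start = end + 1


# A hypothetical object slightly larger than 5 GiB needs six transactions.
ranges = list(byte_ranges(5 * 2**30 + 1))
```

Each yielded pair would correspond to one transaction that transfers only that slice of the object.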
However, the issue of “non-aged” IP addresses and unauthorized
network access to resources does not apply only to routable IP addresses
(i.e., resources intended to be reachable directly from the Internet).
The issue also applies to cloud providers’ internal networks for
customer use and the assignment of non-routable IP addresses. Although your resources may not be directly reachable from
the Internet, for management purposes your resources must be accessible
within the cloud provider’s network via private addressing. (Every
public/Internet-facing resource also has a private address.) Other
customers of your cloud provider may not be well intentioned and might
be able to reach your resources internally via the cloud provider’s
networks. As reported in The Washington Post,
AWS has had problems with abuses of its resources
affecting the public and other customers.
Some products emerging onto the market will help alleviate the problem of IP address reuse, but
unless cloud providers offer these products as managed services,
customers are paying for yet another third-party product to solve a
problem that their cloud provider’s practices created for them.
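The lag between releasing an address and caches forgetting it, described earlier in this section, can be illustrated with a toy time-to-live (TTL) cache. This is a simplified model, not how any particular resolver or ARP implementation works. The host name, the documentation-range address, and the timings are hypothetical, and time is passed in explicitly so that the example is deterministic.

```python
from typing import Dict, Optional, Tuple


class TtlCache:
    """Toy name-to-address cache whose entries expire after a TTL."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[str, float]] = {}

    def put(self, name: str, address: str, ttl: float, now: float) -> None:
        # Store the address together with its absolute expiry time.
        self._entries[name] = (address, now + ttl)

    def get(self, name: str, now: float) -> Optional[str]:
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if now >= expires_at:
            del self._entries[name]  # entry has aged out
            return None
        return address


def exposure_window(release_time: float, cached_at: float, ttl: float) -> float:
    """Seconds during which a released address is still served from cache."""
    return max(0.0, cached_at + ttl - release_time)


cache = TtlCache()
cache.put("app.example.com", "203.0.113.10", ttl=300, now=0)

# The customer releases the address at t=60, but the cache still answers at t=120.
stale_answer = cache.get("app.example.com", now=120)
expired_answer = cache.get("app.example.com", now=300)
window = exposure_window(release_time=60, cached_at=0, ttl=300)
```

In this model, anyone holding the cached entry keeps reaching whatever now sits at the old address for the remaining 240 seconds, which is why a provider that reassigns addresses immediately leaves a window of unintended access.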
3. Ensuring the Availability of Internet-Facing Resources
Reliance on network security has increased because an increased amount
of data or an increased number of organizational personnel now depend on
externally hosted devices to ensure the availability of cloud-provided
resources. Consequently, the three risk factors enumerated in the
preceding section must be acceptable to your organization.
BGP prefix hijacking (i.e., the falsification of Network Layer
Reachability Information) provides a good example of this third risk factor.
Prefix hijacking involves announcing an autonomous system address space that belongs to someone else without the owner’s
permission. Such announcements often occur because of a configuration
mistake, but that misconfiguration may still affect the availability of
your cloud-based resources. According to a study presented to the North
American Network Operators’ Group (NANOG) in February 2006, several hundred such misconfigurations
occur per month. Probably the best-known example of such a misconfiguration
occurred in February 2008, when Pakistan Telecom made an error by
announcing a dummy route for YouTube to its own telecommunications
partner, PCCW, based in Hong Kong. The intent was to block YouTube within Pakistan because of some supposedly
blasphemous videos hosted on the site. The result was that YouTube was
globally unavailable for two hours.
In addition to misconfigurations, there are deliberate attacks as
well. Although prefix hijacking due to deliberate attacks is far less
common than misconfigurations, it still occurs and can block access to
data. According to the same study presented to NANOG, attacks occur
fewer than 100 times per month. Although prefix hijackings are not new, that attack figure will
certainly rise, and probably significantly, along with a rise in cloud
computing. As the use of cloud computing increases, the availability of
cloud-based resources increases in value to customers. That increased
value to customers translates to an increased risk of malicious activity
to threaten that availability.
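Whether accidental or deliberate, a prefix hijack appears in routing data as an announcement that overlaps your address space but originates from an unexpected autonomous system. The following is a minimal sketch of that check, assuming you already have a feed of (prefix, origin AS) pairs from some route-monitoring source. The function name is hypothetical, and the prefixes and AS numbers come from the ranges reserved for documentation.

```python
import ipaddress
from typing import Dict, Iterable, List, Tuple


def find_suspect_announcements(
    owned: Dict[str, int],
    announcements: Iterable[Tuple[str, int]],
) -> List[Tuple[str, int, str]]:
    """Return announcements that overlap an owned prefix from an unexpected origin AS."""
    owned_networks = [(ipaddress.ip_network(p), asn) for p, asn in owned.items()]
    suspects = []
    for prefix, origin_as in announcements:
        announced = ipaddress.ip_network(prefix)
        for network, expected_as in owned_networks:
            if announced.version != network.version:
                continue  # never compare IPv4 with IPv6
            if announced.overlaps(network) and origin_as != expected_as:
                # A longer prefix is preferred by longest-prefix-match routing.
                kind = "more-specific" if announced.prefixlen > network.prefixlen else "same-or-covering"
                suspects.append((prefix, origin_as, kind))
    return suspects


owned_prefixes = {"198.51.100.0/24": 64500}
observed = [
    ("198.51.100.0/24", 64500),    # legitimate announcement
    ("198.51.100.128/25", 64511),  # more-specific prefix from another AS
    ("192.0.2.0/24", 64511),       # unrelated address space
]
suspects = find_suspect_announcements(owned_prefixes, observed)
```

A check like this only detects the problem. Because routers prefer the longest matching prefix, a more-specific announcement can draw traffic away until the upstream networks filter or withdraw it.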
DNS attacks are another example of problems associated with this third
risk factor. In fact, there are several forms of DNS attacks to worry
about with regard to cloud computing. DNS attacks are not new
and are not directly related to the use of cloud computing. The issue
is that cloud computing increases an organization’s risk at the network
level: external DNS querying increases (reducing the effectiveness of
“split horizon” DNS configurations), and more organizational personnel
depend on network security to ensure the availability of the
cloud-provided resources they use.
Although the “Kaminsky Bug” (CVE-2008-1447, “DNS Insufficient Socket Entropy
Vulnerability”) garnered most of the network security attention in 2008,
other DNS problems impact cloud computing as well. Not only are there
vulnerabilities in the DNS protocol and in implementations of
DNS, but also there are fairly widespread DNS cache poisoning attacks whereby a DNS server is
tricked into accepting incorrect information. Although many people
thought DNS cache poisoning attacks had been quashed several years ago,
that is not true, and these attacks are still very much a
problem—especially in the context of cloud computing. Variants of this
basic cache poisoning attack include redirecting the target domain’s
name server (NS), redirecting the NS record to another target domain,
and responding before the real NS (called
DNS forgery).
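DNS forgery succeeds when a spoofed response matches the fields a resolver uses to pair answers with outstanding queries before the real answer arrives. The sketch below models that matching step and the size of the space an off-path attacker must guess. It is a simplified model rather than a resolver implementation. The class and function names and the usable port range are assumptions. The 16-bit transaction ID is part of the DNS protocol, and source port randomization was the widely deployed mitigation for CVE-2008-1447.

```python
import secrets
from dataclasses import dataclass

PORT_LOW, PORT_HIGH = 1024, 65535  # assumption: usable ephemeral port range


@dataclass(frozen=True)
class PendingQuery:
    txid: int
    source_port: int
    server_ip: str
    qname: str


def normalize(name: str) -> str:
    """DNS names are case-insensitive; ignore a trailing dot."""
    return name.lower().rstrip(".")


def new_query(server_ip: str, qname: str) -> PendingQuery:
    """Create a query with an unpredictable transaction ID and source port."""
    return PendingQuery(
        txid=secrets.randbelow(2**16),
        source_port=PORT_LOW + secrets.randbelow(PORT_HIGH - PORT_LOW + 1),
        server_ip=server_ip,
        qname=normalize(qname),
    )


def accept_response(query: PendingQuery, *, txid: int, dest_port: int,
                    from_ip: str, qname: str) -> bool:
    """Accept a response only if every field matches the outstanding query."""
    return (
        txid == query.txid
        and dest_port == query.source_port
        and from_ip == query.server_ip
        and normalize(qname) == query.qname
    )


def guess_space(randomize_port: bool) -> int:
    """Number of (txid, port) combinations a blind forger must cover."""
    ports = (PORT_HIGH - PORT_LOW + 1) if randomize_port else 1
    return 2**16 * ports


query = new_query("192.0.2.53", "Example.COM.")
genuine = accept_response(query, txid=query.txid, dest_port=query.source_port,
                          from_ip="192.0.2.53", qname="example.com")
forged = accept_response(query, txid=(query.txid + 1) % 2**16,
                         dest_port=query.source_port,
                         from_ip="192.0.2.53", qname="example.com")
```

With a fixed source port, the forger has only 65,536 transaction IDs to try. Randomizing the port multiplies that space by the number of usable ports, which makes forgery harder but does not authenticate the answer.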
A final example of problems associated with this third risk factor
is denial of service (DoS) and distributed denial of service (DDoS) attacks. Again,
DoS/DDoS attacks are not new and are not directly related to
the use of cloud computing, but cloud computing increases an
organization’s risk at the network level because more of the resources
it uses are external to its own network. For example, rumors persist of
ongoing DDoS attacks on AWS that make its services unavailable for hours at a time
to AWS users. (Amazon has not acknowledged that service interruptions
are in fact due to DDoS attacks.)
However, when using IaaS, the risk of a DDoS attack is not only
external (i.e., Internet-facing). There is also the risk of an internal
DDoS attack through the portion of the IaaS provider’s network used by customers (separate from
the IaaS provider’s corporate network). That internal (non-routable)
network is a shared resource, used by customers for access to their
non-public instances (e.g., Amazon Machine Images or AMIs) as well as by the provider
for management of its network and resources (such as physical servers).
If I were a rogue customer, there would be nothing to prevent me from
using my customer access to this internal network to find and attack
other customers, or the IaaS provider’s infrastructure—and the provider
would probably not have any detective controls in place to even notify
it of such an attack. The only preventive controls other customers would
have would be how hardened their instances (e.g., AMIs) are, and whether
they are taking advantage of a provider’s capabilities to firewall off
groups of instances (e.g., security groups in AWS).
4. Replacing the Established Model of Network Zones and Tiers with Domains
The established isolation model of network zones and tiers no longer
exists in the public IaaS and PaaS clouds. For years, network security
has relied on zones, such as intranet versus extranet and development
versus production, to segregate network traffic for improved security.
This model was based on exclusion—only individuals and systems in
specific roles have access to specific zones. Similarly, systems within
a specific tier often have only specific access within or across a
specific tier. For example, systems within a presentation tier are not
allowed to communicate directly with systems in the database tier, but
can communicate only with an authorized system within the application
tier. SaaS clouds built on public IaaS or PaaS clouds have similar characteristics. A
public SaaS built on a private IaaS (e.g., Salesforce.com) may follow
the traditional isolation model, but that topology information is not
typically shared with customers.
The traditional model of network zones and tiers has been replaced
in public cloud computing with “security groups,” “security domains,” or
“virtual data centers” that have logical separation between tiers
but are less precise and afford less protection than the formerly
established model. For example, the security groups feature in AWS
allows your virtual machines (VMs) to access each other using a
virtual firewall that has the ability to filter traffic based on
IP address (a specific address or a subnet), packet types
(TCP, UDP, or ICMP), and ports (or a range of ports). Domain names are
used in various networking contexts and application-specific naming and
addressing purposes, based on DNS. For example, Google’s App Engine provides a logical grouping of applications
based on domain names such as
mytestapp.test.mydomain.com and
myprodapp.prod.mydomain.com.
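The filtering that a security group performs, as described above (by IP address or subnet, by packet type, and by port or port range), can be modeled in a few lines. This is a conceptual sketch with hypothetical class and function names and made-up rules. It does not use or reproduce any provider API, and it assumes a default-deny policy when no rule matches.

```python
import ipaddress
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Rule:
    protocol: str        # "tcp", "udp", or "icmp"
    cidr: str            # a single address (/32) or a subnet
    from_port: int = 0
    to_port: int = 65535


def is_allowed(rules: Iterable[Rule], protocol: str, source_ip: str,
               port: Optional[int] = None) -> bool:
    """Return True if any rule permits the traffic; otherwise deny."""
    source = ipaddress.ip_address(source_ip)
    for rule in rules:
        if rule.protocol != protocol:
            continue
        if source not in ipaddress.ip_network(rule.cidr):
            continue
        # ICMP has no ports; TCP and UDP must fall within the rule's range.
        if protocol == "icmp" or (port is not None and rule.from_port <= port <= rule.to_port):
            return True
    return False  # assumption: default deny


# Hypothetical groups standing in for a presentation tier and a database tier.
web_rules = [Rule("tcp", "0.0.0.0/0", 443, 443)]
db_rules = [Rule("tcp", "10.0.1.0/24", 5432, 5432)]

web_from_internet = is_allowed(web_rules, "tcp", "203.0.113.9", 443)
db_from_app_subnet = is_allowed(db_rules, "tcp", "10.0.1.25", 5432)
db_from_elsewhere = is_allowed(db_rules, "tcp", "10.0.2.25", 5432)
```

The separation between the two groups here is purely a matter of address and port matching, which is the sense in which it is logical rather than physical.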
In the established model of network zones and tiers, not only were
development systems logically separated from production systems at the
network level, but these two groups of systems were also physically
separated at the host level (i.e., they ran on physically separated
servers in logically separated network zones). With cloud computing,
however, this separation no longer exists. The cloud computing model of
separation by domains provides logical separation for addressing
purposes only. There is no longer any “required” physical separation, as
a test domain and a production domain may very well be on the same
physical server. Furthermore, the former logical network separation no
longer exists; logical separation now is at the host level with both
domains running on the same physical server and being separated only
logically by VM monitors (hypervisors).
5. Network-Level Mitigation
Given the factors discussed in the preceding sections, what can you do to
mitigate these increased risk factors? First, note that network-level
risks exist regardless of what aspects of “cloud computing” services are
being used (e.g., software-as-a-service, platform-as-a-service, or
infrastructure-as-a-service). The primary determination of risk level is
therefore not which *aaS is being used, but rather
whether your organization intends to use or is using a public, private,
or hybrid cloud. Although some IaaS clouds offer virtual network zoning,
they may not match an internal private cloud environment that performs
stateful inspection and other network security measures.
If your organization is large enough to afford the resources of a
private cloud, your risks will decrease—assuming you have
a true private cloud that is internal to your network. In some cases, a
private cloud located at a cloud provider’s facility can help meet your
security requirements, but that will depend on the provider’s capabilities and
maturity.
You can reduce your confidentiality risks by using encryption, specifically by using validated
implementations of cryptography for data-in-transit. Secure digital
signatures make it much more difficult, if not impossible, for someone
to tamper with your data undetected, and this protects data integrity.
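As one way to apply encryption to data-in-transit, a client can refuse any unencrypted channel and require certificate validation. The sketch below uses Python’s standard `ssl` and `urllib` modules. The function names and the TLS 1.2 minimum are assumptions, and whether the underlying cryptographic library counts as a validated implementation depends on how it was built and deployed, which this code cannot establish.

```python
import ssl
import urllib.request
from urllib.parse import urlsplit


def make_client_context() -> ssl.SSLContext:
    """Build a client context that verifies certificates and host names."""
    context = ssl.create_default_context()  # certificate and host name checks are on by default
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # assumption: policy minimum
    return context


def fetch_over_tls(url: str, timeout: float = 10.0) -> bytes:
    """Fetch a URL, refusing anything that is not HTTPS."""
    if urlsplit(url).scheme != "https":
        raise ValueError("refusing to send data over an unencrypted channel")
    with urllib.request.urlopen(url, context=make_client_context(), timeout=timeout) as response:
        return response.read()
```

Calling `fetch_over_tls("http://example.com")` raises `ValueError` before any network traffic is sent, so a misconfigured endpoint fails closed rather than silently falling back to HTTP.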
Availability problems at the network level are far more difficult
to mitigate with cloud computing—unless your organization is using a
private cloud that is internal to your network topology. Even if your
private cloud is a private (i.e., non-shared) external network at a
cloud provider’s facility, you will face increased risk at the network
level. A public cloud faces even greater risk. But let’s keep some
perspective here—greater than what?
Even large enterprises with significant resources face
considerable challenges at the network level of infrastructure security.
Are the risks associated with cloud computing actually higher than the
risks enterprises are facing today? Consider existing private and public
extranets, and take into account partner connections when making such a
comparison. For large enterprises without significant resources, or for
small to medium-size businesses (SMBs), is the risk of using public clouds (assuming that
such enterprises lack the resources necessary for private clouds) really
higher than the risks inherent in their current infrastructures? In many
cases, the answer is probably no—there is not a
higher level of risk.
Table 1 lists security controls at the network level.
Table 1. Security controls at the network level
Threat outlook | Low (with the exception of DoS attacks) |
---|---|
Preventive controls | Network access control supplied by provider (e.g., firewall), encryption of data in transit (e.g., SSL, IPSec) |
Detective controls | Provider-managed aggregation of security event logs (security incident and event management, or SIEM), network-based intrusion detection system/intrusion prevention system (IDS/IPS) |