Cloud Computing and the Data Center

To many, cloud computing and WAN utilization can be considered synonymous. More organizations are seeing the direct benefits of moving to a cloud platform. Whether they are trying to reduce their internal hardware footprint or to distribute their data, the data center infrastructure is a critical component of the entire process.

Businesses are trying to find ways to better deliver their information and remain competitive in this “cloud-centric” market. WAN technologies continue to evolve as more bandwidth, better underlying infrastructure, and optimization all create a more robust connection. As more organizations move to the cloud, they will increase their reliance on data center availability and technologies. This is where data center managers must look to the cloud to help them stay ahead of the curve and continue to offer great technology solutions.

The market will continue to grow, and some leading data center providers are now actively looking at ways to enter the cloud and WAN markets. Even now, more data centers are finding new delivery models and services that not only improve customer agility but also drive additional revenue.

  • Enhanced connectivity. Software-defined technologies have helped the data center become more distributed and robust. Multiple data centers across the country can be spanned by one logical network and deliver cloud services with very little latency. This type of advancement truly brings data centers closer together and creates numerous options for the customer. The ability to build a “business-in-a-box” becomes much easier, since a single data center provider can effectively support an entire global organization. More bandwidth, better WAN optimization, and more cloud platforms all translate into more opportunities for the data center provider community. With better links between sites, more data centers can open up their environments to a broader range of customers.
  • New cloud services. Expansive data center resources mean new cloud services, whether data center providers are acting as hosting solutions for various cloud options or building entirely new cloud platforms for private, public, or even community access. Right now, data center cloud architecture is on the rise; just take a look at recent cloud architect job trends. There are many cloud options for data centers to dive into, with backup and recovery offerings among the most popular. Data centers are now trying to ramp up their WAN presence by increasing bandwidth and improving the underlying hardware. Cloud hosting, deployment, and implementation services are definitely ways that data center environments can continue to help the industry grow – and generate additional revenue in the process.
  • Better DR and recoverability. The increase in bandwidth and WAN control has given rise to cloud-based disaster recovery (DR). Data center providers can effectively remove single points of failure by quickly replicating data over the WAN to hot or standby sites. Many organizations, especially in light of recent natural disasters, are looking heavily into business continuity. WAN optimization and faster links into the data center have made disaster recovery methodologies feasible and affordable for more organizations. DR and business continuity operations aren’t cheap, but they have come down in price as demand has gone up and competition has increased. Now, organizations have the option of replicating vital VM or server information into the cloud. Then, in the event of an emergency, the data center provider can spin up VMs to allow the organization to connect into the data center. Using WAN optimization and calculated bandwidth utilization, users can continue to work even though their primary data center is down.
  • Greater data center elasticity. With improvements in the WAN and cloud models, data centers have been able to focus on growth and expansion. Providers are now able to open more branch data centers to support more users. Internet and WAN speeds have increased as ISPs and providers try to keep up with demand. IT consumerization has created a “data-on-demand” end user who requires connectivity and availability at almost all times. Not only can a data center utilize greater bandwidth resources, it can support more users and control the connections more granularly. All of this results in an expanding business model in which data center providers can grow and provision new customer resources quickly. For example, a data center provider can offer burst WAN speeds during peak times to a seasonal business – a travel agency, for instance – so that the company can meet the demands of visiting customers. Once the peak times are over, the provider can dynamically de-provision those resources and return them to the pool. This creates a more robust, more elastic data center capable of supporting more users at any given time.

Cloud computing will continue to provide great benefits for those organizations that fit the cloud model. At the heart of it all sits the data center, which provides connectivity and data delivery for many cloud components. Now, more than ever, data center administrators must look to WAN technologies to help not only optimize their offerings but also increase the benefits of working with a WAN-based, cloud-ready environment.

Cisco Cloud Data Center Building Blocks
Cisco brings an important set of technology building blocks to the foundation of cloud architectures. They are described as follows.
10 Gigabit Ethernet
A cloud data center is designed with a high density of virtual machines coupled with a high processor core count. From a networking perspective, the increase in virtual machine and processor core density promotes a transition to 10 Gigabit Ethernet as the required mechanism for attaching servers. Multiple virtual machines on a single server can quickly overwhelm a single Gigabit Ethernet link, and multiple Gigabit Ethernet links can increase costs. Moreover, there needs to be a strategy in place not only to take the existing investment in 1 Gigabit Ethernet and seamlessly integrate it into a 10 Gigabit infrastructure, but also to enable migration to 10 Gigabit Ethernet and unified fabric (described next). Interestingly, this adoption necessitates virtual machine–aware networking. Cisco is bringing new terminology and implementation to the industry in this regard, called VN-Link. VN-Link is the virtual link between the virtual machine and the physical interface of the physical server. This implementation enables operational consistency down to the individual virtual machine as well as policy portability, so network and security policy follows virtual machines as they move around the data center. Cisco VN-Link helps enable new capabilities and features, simplifies management and operations, and allows scaling for server virtualization solutions. Specific benefits include:
● Real-time policy-based configuration
● Mobile security and network policy, moving policy with the virtual machine during virtual machine mobility, and live migration for persistent network, security, and storage compliance
● Nondisruptive management model, aligning management and operations environments for virtual machines and physical server connectivity in the data center
Unified Fabric
If one studies a typical data center server infrastructure, it is easy to notice that servers have a series of network interfaces connected to multiple types of networks (LAN, SAN, IPC). This arrangement adds complexity in the form of cost, cabling, port count, scalability, power, and cooling. If we follow the same tradition in a cloud data center, the architecture will not scale to the density that is typically expected. Hence, to continue to reduce the total cost of ownership (TCO) and to deploy virtual machines, all servers must have a consistent and ubiquitous set of network and storage capabilities. One of the simplest and most efficient ways to deliver these capabilities is to deploy a unified fabric. The shift to a unified fabric gives all servers (physical and virtual) access to the LAN, SAN, and IPC networks, allowing more of the customer’s network to be consolidated for greater efficiency and cost savings.
Cisco is offering not only 10 Gigabit Ethernet but also lossless 10 Gigabit Ethernet, currently called Data Center Ethernet or Enhanced Ethernet. This becomes the foundation for consolidating fabrics like Fibre Channel (for SAN), which require the stringent lossless nature of a network. Fibre Channel over Ethernet (FCoE), a standard accepted by standards bodies and industry, is leading the way to a unified fabric in the cloud data center. Hence, to consolidate server I/O, the server access layer must be adapted to support a unified fabric. Additionally, a new breed of adapters, called converged network adapters (CNAs), is implemented in the server platform and acts as the consolidation and virtualization point in the compute layer.
Unified Computing
The unified fabric now enables a fully virtualized cloud data center with pools of computing, network, and storage resources, through the Cisco Unified Computing System (UCS).
The Cisco UCS bridges the silos in the classic data center, enabling better utilization of infrastructure in a fully virtualized environment, and creates a unified architecture using industry-standard technologies that provide interoperability and investment protection. UCS unites computing, network, storage access, and virtualization resources into a scalable, modular design that is managed as a single energy-efficient system through an embedded management framework in the Cisco UCS platform.
The Cisco UCS management framework provides a robust API for managing all system configuration and operation. It also helps increase cloud data center staff productivity, enabling those who manage storage, networking, computing, and applications to collaborate on defining service profiles for applications. Service profiles help automate provisioning, allowing a cloud data center to provision applications in minutes instead of days. This provides a means to stateless computing, where compute nodes have no inherent state pertaining to the applications they might execute.
So, at any given time, a machine could be running operating system X, and the next minute it could be rebooted and running hypervisor Y. The compute node is thus just an execution engine with CPU, memory, and disk (flash or hard drive). The core concept of the stateless computing model is to separate access to the application from execution of the application. Stateless computing provides a holistic way to address configuration management, rapid provisioning, upgrades/downgrades, scalability, policy enforcement, and auditing.
Cisco UCS provides support for a unified fabric over a low-latency, lossless, 10-Gbps Ethernet foundation. This network foundation consolidates today’s separate networks: LANs, SANs, and high-performance computing networks. Network consolidation lowers costs by reducing the number of network adapters, switches, and cables and thus decreasing power and cooling requirements. Cisco UCS also allows consolidated access to both SANs and network attached storage (NAS). With its unified fabric, the Cisco UCS can access storage over Ethernet, Fibre Channel, FCoE, and iSCSI, providing enterprises with choices and investment protection. In addition, storage access policies can be preassigned for system connectivity to storage resources, simplifying storage connectivity and management. The Cisco UCS platform, based on the Intel Xeon processor families, offers patented extended memory technology to support applications with large data sets and to allow significantly more virtual machines per server, a key requirement for cloud data centers and applications. Cisco UCS network adapters include adapters optimized for virtualization, compatibility with existing driver stacks, and efficient, high-performance Ethernet. With integrated management and a “wire-once” unified fabric on an industry-standard computing platform, the Cisco UCS optimizes virtualization, provides dynamic resource provisioning for increased agility, and reduces total data center costs, in both CapEx and OpEx.
Offering a new style of dynamic IT, Cisco UCS extends virtualized data centers and creates a foundation for private clouds that federate with compatible virtual private clouds. With the virtualized environment defined by a dynamic, scalable data center fabric, a workload really can run anywhere; the resources needed to support a workload can even come from an outside service provider in a cloud-computing model. The technology architecture above, based on Cisco’s and its ecosystem partners’ data center building blocks, represents only a sample of the building blocks of a next-generation cloud data center. Moreover, the end-state technical architecture would not only contain the components listed above and below but would also be governed by different types of service and regulation/compliance requirements.
Other key software components are:
● Business applications for service orchestration
● Service delivery management applications for service discovery, mapping, and compliance
● SLA metering, measurement, and billing applications for accountability
● Web and business logic hosting applications such as databases, and application and web servers
Other key facilities components are:
● Power and cooling components
● Data center physical construction components
● Racking and cabling components
Cisco partners with software application and data center facilities solution providers in the above segments to provide a comprehensive cloud data center solution.

Trust in Cloud Data Center

Cisco also brings the following security- and trust-related considerations in the infrastructure models of cloud computing. Cisco believes that gaining the advantages of cloud computing in the enterprise begins with establishing a trusted approach to the cloud. Just as we trust a financial institution with our valuables and monetary assets, a similar level and attributes of trust need to be established in cloud architecture. Hence the definition of private and virtual private clouds is based on the trust domain in addition to physical presence domains. The network can uniquely address trust in private clouds. Trust in a cloud data center centers on several core concepts:
Security: Traditional issues around data and resource access control, encryption, and incident detection are factors here.
Control: The ability of the enterprise to directly manage how and where data and applications are deployed and used.
Compliance and service-level agreement (SLA) management: This concept refers to the contracting and enforcement of service-level agreements between a variety of parties, and conformance with regulatory, legal, and general industry requirements.
Cisco has adopted the above core concepts in its solutions and services for cloud computing.

Cisco: Cloud Will Soon Handle Most Data Center Workloads

There has been no shortage of predictions that the cloud will eventually lead to the demise, or diminished role, of corporate data centers. However, new industry data suggests that cloud computing will soon make data centers busier places than ever.

Cisco just released its first Global Cloud Index, and it has a clear message: it doesn’t matter whether you’re using applications from your own data center or from somewhere else; they will be delivered from the cloud. Organizations’ data centers will be moving to the cloud in a big way. And, by extension, chances are your own organization will be as much a cloud provider as one outside the firewall. This has interesting ramifications for any organization that maintains its own data centers and for the people who run those data centers.

Source: 2011 Cisco Global Cloud Index

Cisco’s new index is built on a series of predictions on the growth of global data centers and cloud-based IP traffic. The vendor estimates that global data center traffic — cloud and non-cloud — will grow four-fold from 2010 to 2015 and reach 4.8 zettabytes annually by 2015 — growing at a rate of 402 exabytes per month by that time.

The cloud computing piece of it will grow 12-fold over the forecast period, and represent over one-third of all data center traffic by 2015. Further, Cisco predicts, more than 50% of all data center workloads will be processed in the cloud.

Global cloud traffic will grow at a compounded annual growth rate of 66%, reaching 1.6 zettabytes by the end of 2015. That’s a rate of 133 exabytes per month.

Cisco predicts that in two years, by 2014, the amount of workloads processed in cloud data centers (51%) will surpass those processed in traditional data centers (49%) for the first time.  A year after that, cloud computing will grow to 57% of data center workloads.

What does Cisco base these assumptions on?  The vendor points to increasing data center virtualization — the addition of a switch and service layer that abstracts the hardware and applications underneath — as a key indicator of cloud computing adoption.

This means it’s time for IT and business leaders to recognize the changing relationship technology has with their organizations. IT is no longer a “department” somewhere in the building that runs the reports and keeps the lights on in the server room. Rather, it is evolving into a broker of services that need to be provisioned, on demand, as the business changes and expands in current markets or moves into new markets. IT’s role is to help the business determine whether the services the business needs at that moment are available from the organization’s own IT resource network, or whether they need to be secured from outside networks.

It also opens the door for opportunities for organizations with data centers to extend their cloud services to a wider audience — partners, vendors, current customers, and beyond.

Significantly, in its conclusions, Cisco doesn’t make a big deal about the distinction between public cloud (“consumer”) data centers and corporate (“business”) data centers. The index is based on the assumption that most businesses and their end users will be using the compute and application resources of a mixture of on-premises and off-premises service providers. Also, a lot of the growth of cloud within organizations can be attributed to new devices and new end-user expectations:

“The growing number of end-user devices combined with consumer and business users’ preference or need to stay connected is creating new network requirements. The evolution of cloud services is driven in large part by users’ expectations to access applications and content anytime, from anywhere, over any network and with any device. Cloud-based data centers can support more virtual machines and workloads per physical server than traditional data centers.”

Thanks to cloud and virtualization, data centers will be even busier places than they have been. The Cisco index estimates that 76% of all IP traffic will still be within the data center three years from now, barely changed from the current 77% level. That’s because there will continue to be a lot of messaging and data moving between applications, databases, backup sites, and processors via parallel processing. As the Cisco analysis put it:

“Traditionally, one server carried one workload. However, with increasing server computing capacity and virtualization, multiple workloads per physical server are common in cloud architectures. Cloud economics, including server cost, resiliency, scalability, and product lifespan, are promoting migration of workloads across servers, both inside the data center and across data centers (even centers in different geographic areas). Often an end user application can be supported by several workloads distributed across servers. This can generate multiple streams of traffic within and between data centers, in addition to traffic to and from the end user.”

10 things you should know about NoSQL databases

The relational database model has prevailed for decades, but a new type of database — known as NoSQL — is gaining attention in the enterprise. Here’s an overview of its pros and cons.

For a quarter of a century, the relational database (RDBMS) has been the dominant model for database management. But, today, non-relational, “cloud,” or “NoSQL” databases are gaining mindshare as an alternative model for database management. In this article, we’ll look at the 10 key aspects of these non-relational NoSQL databases: the top five advantages and the top five challenges.

Five advantages of NoSQL

1: Elastic scaling

For years, database administrators have relied on scale up — buying bigger servers as database load increases — rather than scale out — distributing the database across multiple hosts as load increases. However, as transaction rates and availability requirements increase, and as databases move into the cloud or onto virtualized environments, the economic advantages of scaling out on commodity hardware become irresistible.

RDBMS might not scale out easily on commodity clusters, but the new breed of NoSQL databases is designed to expand transparently to take advantage of new nodes, usually with low-cost commodity hardware in mind.
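As a concrete illustration, here is a minimal sketch of what scaling out can look like in practice, using MongoDB’s sharding commands via pymongo; all hostnames and database names are hypothetical placeholders.

```python
# A minimal sketch of scaling out a sharded MongoDB cluster with pymongo.
# All hostnames and database names are hypothetical placeholders.
from pymongo import MongoClient

# Connect to a mongos query router for the cluster.
client = MongoClient("mongodb://mongos.example.net:27017")

# Register a new replica set as an additional shard; the cluster's
# balancer then redistributes data onto the new node automatically.
client.admin.command("addShard", "rs2/node3.example.net:27018")

# Enable sharding for a database and spread one collection across
# shards by a hashed key.
client.admin.command("enableSharding", "appdb")
client.admin.command("shardCollection", "appdb.events",
                     key={"user_id": "hashed"})
```

Once the new shard is registered, the cluster rebalances data without application changes, which is the essence of elastic scaling.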

2: Big data

Just as transaction rates have grown beyond recognition over the last decade, the volumes of data being stored have also increased massively. O’Reilly has cleverly called this the “industrial revolution of data.” RDBMS capacity has been growing to match these increases, but as with transaction rates, the constraints of data volumes that can be practically managed by a single RDBMS are becoming intolerable for some enterprises. Today, the volumes of “big data” that can be handled by NoSQL systems, such as Hadoop, outstrip what can be handled by the biggest RDBMS.

3: Goodbye DBAs (see you later?)

Despite the many manageability improvements claimed by RDBMS vendors over the years, high-end RDBMS systems can be maintained only with the assistance of expensive, highly trained DBAs. DBAs are intimately involved in the design, installation, and ongoing tuning of high-end RDBMS systems.

NoSQL databases are generally designed from the ground up to require less management: automatic repair, data distribution, and simpler data models lead to lower administration and tuning requirements — in theory. In practice, it’s likely that rumors of the DBA’s death have been slightly exaggerated. Someone will always be accountable for the performance and availability of any mission-critical data store.

4: Economics

NoSQL databases typically use clusters of cheap commodity servers to manage the exploding data and transaction volumes, while RDBMS tends to rely on expensive proprietary servers and storage systems. The result is that the cost per gigabyte or transaction/second for NoSQL can be many times less than the cost for RDBMS, allowing you to store and process more data at a much lower price point.
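To make the cost argument concrete, here is a purely illustrative back-of-the-envelope calculation; every price and capacity below is a hypothetical placeholder, not a vendor figure.

```python
# Purely illustrative arithmetic: cost per usable gigabyte for one
# high-end proprietary server versus a cluster of commodity nodes.
# Every figure below is a hypothetical placeholder, not a vendor quote.
rdbms_cost, rdbms_gb = 100_000, 5_000          # one proprietary server
node_cost, node_gb, nodes = 3_000, 2_000, 10   # commodity NoSQL cluster

print(f"RDBMS: ${rdbms_cost / rdbms_gb:.2f} per GB")                    # $20.00
print(f"NoSQL: ${(node_cost * nodes) / (node_gb * nodes):.2f} per GB")  # $1.50
```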

5: Flexible data models

Change management is a big headache for large production RDBMS. Even minor changes to the data model of an RDBMS have to be carefully managed and may necessitate downtime or reduced service levels.

NoSQL databases have far more relaxed — or even nonexistent — data model restrictions. NoSQL Key Value stores and document databases allow the application to store virtually any structure it wants in a data element. Even the more rigidly defined BigTable-based NoSQL databases (Cassandra, HBase) typically allow new columns to be created without too much fuss.

The result is that application changes and database schema changes do not have to be managed as one complicated change unit. In theory, this will allow applications to iterate faster, though, clearly, there can be undesirable side effects if the application fails to manage data integrity.
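As a small illustration of this flexibility, the following sketch uses pymongo to store differently shaped documents in one collection; the connection details and field names are hypothetical.

```python
# A small sketch of schema flexibility in a document store, using
# pymongo. Connection details and field names are hypothetical.
from pymongo import MongoClient

users = MongoClient("mongodb://localhost:27017").appdb.users

# Two differently shaped documents live in the same collection; no
# ALTER TABLE or migration step is needed to introduce new fields.
users.insert_one({"name": "alice", "email": "alice@example.com"})
users.insert_one({"name": "bob",
                  "email": "bob@example.com",
                  "tags": ["beta", "mobile"],     # new list-valued field
                  "prefs": {"theme": "dark"}})    # new nested field

# Queries simply skip documents that lack the new fields.
for doc in users.find({"tags": "beta"}):
    print(doc["name"])                            # prints: bob
```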

Five challenges of NoSQL

The promise of the NoSQL database has generated a lot of enthusiasm, but there are many obstacles to overcome before these databases can appeal to mainstream enterprises. Here are a few of the top challenges.

1: Maturity

RDBMS systems have been around for a long time. NoSQL advocates will argue that their advancing age is a sign of their obsolescence, but for most CIOs, the maturity of the RDBMS is reassuring. For the most part, RDBMS systems are stable and richly functional. In comparison, most NoSQL alternatives are in pre-production versions with many key features yet to be implemented.

Living on the technological leading edge is an exciting prospect for many developers, but enterprises should approach it with extreme caution.

2: Support

Enterprises want the reassurance that if a key system fails, they will be able to get timely and competent support. All RDBMS vendors go to great lengths to provide a high level of enterprise support.

In contrast, most NoSQL systems are open source projects, and although there are usually one or more firms offering support for each NoSQL database, these companies often are small start-ups without the global reach, support resources, or credibility of an Oracle, Microsoft, or IBM.

3: Analytics and business intelligence

NoSQL databases have evolved to meet the scaling demands of modern Web 2.0 applications. Consequently, most of their feature set is oriented toward the demands of these applications. However, data in an application has value to the business that goes beyond the insert-read-update-delete cycle of a typical Web application. Businesses mine information in corporate databases to improve their efficiency and competitiveness, and business intelligence (BI) is a key IT issue for all medium to large companies.

NoSQL databases offer few facilities for ad-hoc query and analysis. Even a simple query requires significant programming expertise, and commonly used BI tools do not provide connectivity to NoSQL.

Some relief is provided by the emergence of solutions such as Hive or Pig, which can provide easier access to data held in Hadoop clusters and, perhaps eventually, other NoSQL databases. Quest Software has developed a product — Toad for Cloud Databases — that can provide ad hoc query capabilities to a variety of NoSQL databases.

4: Administration

The design goals for NoSQL may be to provide a zero-admin solution, but the current reality falls well short of that goal. NoSQL today requires a lot of skill to install and a lot of effort to maintain.

5: Expertise

There are literally millions of developers throughout the world, in every business segment, who are familiar with RDBMS concepts and programming. In contrast, almost every NoSQL developer is in learning mode. This situation will resolve itself naturally over time, but for now it’s far easier to find experienced RDBMS programmers or administrators than NoSQL experts.

Conclusion

NoSQL databases are becoming an increasingly important part of the database landscape, and when used appropriately, can offer real benefits. However, enterprises should proceed with caution with full awareness of the legitimate limitations and issues that are associated with these databases.

Migrating from a relational to a NoSQL cloud database

The following post describes the approach one needs to take in migrating a standard relational database to an open source NoSQL cloud database.

The industry is on the cusp of tectonic changes in how and where data are stored and processed. For over 30 years, the venerable relational database management system (RDBMS), running in corporate data centers, has held the bulk of the world’s data. This cannot continue. RDBMS technology can no longer keep pace with the velocity, volume, and variety of data being created and consumed. For this new world of Big Data, NoSQL databases are required.

Migrating to these open-source cloud databases, however, requires some preparation for enterprise IT that grew up with RDBMS.

How Big Data is changing everything

There’s nothing wrong with the traditional RDBMS. It simply doesn’t fit the world we live in anymore. Mobile, social, cloud: these and other trends complicate the variety of data and dramatically increase the volume of data being stored in the enterprise.

As RedMonk analyst James Governor argues:

The database market is back in play after a 30-year old freeze in which Oracle dominated the high end, and Microsoft the midmarket. Then along came open source, the cloud, NoSQL, in memory and everything changed….The idea that everything is relational? Those days are gone.

This isn’t something that only concerns so-called web companies like Google and Foursquare. It’s equally relevant for “old school” organizations in finance, healthcare, government, retail, and other vertical industries, and it is projected by Gartner to drive $28 billion in IT spending in 2012. As organizations grapple with their Big Data problems, when data grow beyond one server or start out distributed, they generally find themselves on the same road as the web companies: open-source, NoSQL databases.

While Big Data often gets associated with data analytics technologies like Hadoop and Storm, it’s actually much broader than this, and far more concerned with data storage than analytics. After all, if an enterprise can’t scale storage effectively, it will never have a “Big Data” problem to analyze. Hence, of the $30 billion global database market, only 25 percent is analytics, with the rest being OLTP or operational databases. Ironically, the recent rise of data analytics innovations like Hadoop stems from RDBMS failure to cope with Gartner’s three V’s of Big Data: high-volume, high-velocity, and high-variety of data.

Migrating from RDBMS to NoSQL

It’s clear that the database is critical to successfully managing the explosion of data. What’s less clear is how to transition from legacy RDBMS to modern NoSQL databases. Successfully migrating from a relational world to a NoSQL world requires careful planning.

In fact, one of the biggest dings against NoSQL databases like MongoDB or Neo4j is that they’re so easy to work with that developers end up jumping in headfirst, without bothering to properly construct their data model, thereby causing problems later. While NoSQL databases do provide significantly more developer agility and flexibility, they still shouldn’t be used willy-nilly.

This is particularly true for those starting from an RDBMS background, as NoSQL differs markedly from relational. In the RDBMS world, an engineer designs the data schema from the outset, and SQL queries are then run against the database. If business/application changes then require changes to the database, a DBA must get involved. It’s not an easy process, as the DBA must navigate complex joins (i.e., inter-table relationships). NoSQL databases better fit modern application development, and provide significant database performance and developer agility benefits, albeit at the expense of some functionality.
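The contrast can be sketched in a few lines. The schemas below are hypothetical, but they show the basic shift: the relational model reassembles an order with a join, while the document model stores it whole.

```python
# Hypothetical schemas sketching the modeling shift. In the relational
# model, an order's line items live in a separate table and are
# reassembled with a join at query time:
ORDER_QUERY = """
SELECT o.id, i.product, i.qty
FROM orders o
JOIN order_items i ON i.order_id = o.id
WHERE o.id = ?
"""

# In a document database, the same order is one self-contained document:
# reads need no join, and the shape can evolve as the application does.
order_doc = {
    "_id": 1001,
    "customer": "alice",
    "items": [
        {"product": "widget", "qty": 2},
        {"product": "gadget", "qty": 1},
    ],
}
```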

Training

NoSQL databases are new enough that many database engineers will be RDBMS experts, but NoSQL neophytes. This shouldn’t deter developers hoping to use NoSQL in a new project. After all, most NoSQL databases are open source and come with built-in communities, happy to help new users get up to speed.

Part of this shift is one of nomenclature. For example, as relational database expert Chris Bird points out, the syntax in NoSQL Land differs greatly from SQL, and may require some mental gymnastics for new users.

According to Daniel Doubrovkine, Art.sy’s head of engineering, both NoSQL and RDBMS databases impose a learning curve on new users. The difference, he argues, is that NoSQL databases like MongoDB are simple to start with and get more complex over time, which works because a developer’s expertise with the database matures over time, too. With SQL, Doubrovkine says, it’s hard from the start and only becomes more complex at scale, if the requisite scale is even possible with RDBMS.

Of course, getting everything “perfect” from the start is difficult no matter what database technology you’re using. As Mozilla and XEmacs developer Jamie Zawinski opines,

The design process is definitely an ongoing thing; you never know what the design is until the program is done.

One of the great things about NoSQL, in fact, is the ability to iterate on one’s data model as one’s business requires it.

That’s not to say that developers should go in blind. For some, checking forums and online documentation is enough. No matter the NoSQL database a developer may prefer, there is plenty of online documentation for each of them.

For others, hands-on training is preferred. In addition to standard, classroom-based training offered by DataStax, Basho, and other vendors that sponsor open-source NoSQL databases, there is also free online training. As just one indication of how strong demand is for NoSQL training, 10gen registered over 30,000 people for its inaugural online training.

Migration

Armed with information on how to best develop an application using NoSQL technology, the next step for many new users is to migrate away from the decades-old relational world they know.

But the bigger issue is careful planning of one’s migration.

With over 25 million users and 2.5 billion check-ins, Foursquare runs at serious scale. But it didn’t start that way. Though Foursquare now logs check-ins on Mars, just a few short years ago it logged its first check-in on Earth. As the company grew, Foursquare’s development team had to scramble to ensure its data infrastructure could keep up with its user adoption.

Foursquare originally started with MySQL. When Harry Heymann, Foursquare’s vice president of Engineering, joined in 2009, he moved Foursquare to PostgreSQL because it better suited the tools he was using. That all changed when the service took off with users, as Jon Hoffman, Foursquare’s storage infrastructure engineering lead, has indicated. Scaling PostgreSQL promised to involve significant work, so Heymann started reviewing other options, including MongoDB, Cassandra, CouchDB, and sharded MySQL.

Once Foursquare determined that MongoDB best fit its requirements, the company moved to MongoDB slowly, given the requisite code changes and the potential risk of breaking things in the transition. For a period of time, Foursquare duplicated data, storing it in PostgreSQL and MongoDB as two parallel, synced sets. For one collection, this process took several months, as the service already had one million users and significant traffic. For smaller collections, the data migrations were faster.
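Foursquare has not published its migration code, but the dual-write pattern it describes might look something like this generic sketch, with SQLite standing in for the legacy store and hypothetical table and connection names.

```python
# A generic sketch of the dual-write pattern described above: every new
# check-in is written to both stores while reads stay on the legacy
# database. SQLite stands in for PostgreSQL; all names are hypothetical.
import sqlite3
from pymongo import MongoClient

legacy = sqlite3.connect("legacy.db")
legacy.execute("CREATE TABLE IF NOT EXISTS checkins (user_id, venue_id)")
mongo = MongoClient("mongodb://localhost:27017").appdb

def record_checkin(user_id, venue_id):
    # 1. Write to the system of record (the relational database).
    legacy.execute("INSERT INTO checkins VALUES (?, ?)", (user_id, venue_id))
    legacy.commit()
    # 2. Mirror the write into the new store. Failures are logged rather
    #    than surfaced, so the legacy path remains authoritative until
    #    the MongoDB copy is verified and promoted.
    try:
        mongo.checkins.insert_one({"user_id": user_id, "venue_id": venue_id})
    except Exception as exc:
        print(f"mongo mirror failed, will backfill later: {exc}")

record_checkin(42, 7)
```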

Doing so, however, paid off. One of the biggest, early wins was moving the geographic query functions from PostgreSQL to MongoDB, which enabled Foursquare to handle the same load with fewer resources.

Art.sy, which indexes and makes searchable high-quality images of 30,000-plus works of art from over 3,000 artists, also transitioned from RDBMS to NoSQL, though its transition process was much more straightforward than Foursquare’s. Its migration from relational to NoSQL happened while the company was still in the midst of a closed beta. One cold restart later, the company had moved from its relational beginnings to NoSQL.

Importantly, the process for data migration will depend on which NoSQL technology a company chooses. The process for moving RDBMS data into a columnar database like Cassandra differs from migrations to key-value stores like Riak or to document databases like MongoDB.

The process, in each case, largely involves the same four steps, as Kristina Chodorow and Michael Dirolf identify:

  • Get to know your NoSQL database. Download it, read the tutorials, try some toy projects.
  • Think about how to represent your model in its document store [or key/value, column, graph, as appropriate].
  • Migrate the data from the relational database to your NoSQL database, probably simply by writing a bunch of SELECT * FROM statements against the database and then loading the data into your NoSQL document [or key/value, column, graph] model using the language of your choice.
  • Rewrite your application code to query your NoSQL database through statements such as insert() or find().

This process will look different depending on the style of NoSQL database, but it’s a good rough guide.
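As an illustration of steps three and four, here is a minimal hedged sketch that copies rows from a relational table into MongoDB; the table, fields, and connection strings are hypothetical.

```python
# A minimal sketch of step three: copy every row of a relational table
# into a MongoDB collection. Table, fields, and connection strings are
# hypothetical; a real migration would batch, validate, and log progress.
import sqlite3
from pymongo import MongoClient

src = sqlite3.connect("legacy.db")
src.row_factory = sqlite3.Row                  # rows behave like dicts
src.execute("CREATE TABLE IF NOT EXISTS customers (id INTEGER, name TEXT)")
src.execute("INSERT INTO customers VALUES (1, 'alice')")

dst = MongoClient("mongodb://localhost:27017").appdb.customers

for row in src.execute("SELECT * FROM customers"):
    dst.insert_one(dict(row))                  # one document per row

# Step four in miniature: the application now queries the new store.
print(dst.find_one({"name": "alice"}))
```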

Concluding remarks

Foursquare’s and Art.sy’s applications may be somewhat unique, but their need to embrace a flexible, scalable data infrastructure is not. Smart companies architect for scale from the very beginning, which generally will mean turning to NoSQL. For those that start with a relational database, all is not lost: the process for migrating from an RDBMS to NoSQL is now well-trod, with a great deal of information available online and offline to help with the process.

Cloud app vs. web app: Understanding the differences

The line between a cloud app and a web app remains as blurry as ever, which of course stems from the natural similarities between them. However, there are noteworthy differences, especially when looking to leverage cloud apps for a richer user customization experience and seamless integration with the resilient, scalable back-end infrastructure that often characterizes public cloud services.

Cloud app

A cloud app is the evolved web app. Like a web app, it is used to access online services over the Internet, but it is not always exclusively dependent on a web browser to work. A customizable, multi-tenant cloud app may be available solely through the web browser, but quite often the web interface serves as an alternative access method to a custom-built client for the online service.

Cloud apps are usually characterized by advanced features such as:

  • Data is stored in a cloud or cloud-like infrastructure
  • Data can be cached locally for full offline mode
  • Support for different user requirements, e.g., a data backup cloud app with features such as data compression, security, and backup scheduling
  • Can be used from a web browser and/or custom-built apps installed on Internet-connected devices such as desktops and mobile phones
  • Can be used to access a wider range of services, such as on-demand computing cycles, storage, and application development platforms

Examples of cloud apps

Some common examples include Mozy, Evernote, SugarSync, Salesforce, Dropbox, NetSuite, and Zoho.com. Other qualifying examples, such as web email (Google, Yahoo, Microsoft Hotmail, etc.), may not be so obvious, but they depend on cloud technology and are available offline if consumers choose to configure them that way.

There are numerous websites where you can find useful information on cloud apps; we find www.getapp.com to be particularly informative. It includes cloud app reviews and ratings to help evaluate the apps.

Web apps

Web apps, on the other hand, are almost exclusively designed to be used from a web browser. A combination of server-side script (ASP, PHP, etc.) and client-side script (HTML, JavaScript, Adobe Flash) is commonly used to develop the web application. The web browser (a thin client) relies on the web server components installed on backend infrastructure systems for the heavy lifting in providing its core functional web services.

The obvious benefit that this computing model provides over the traditional desktop app is that it is accessible from anywhere via the web browser. Cloud apps can also be accessed this way.
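A toy example makes the thin-client division of labor clear. The sketch below uses Flask, with a hypothetical route and hard-coded data standing in for real back-end services.

```python
# A toy Flask app illustrating the thin-client model: the browser only
# renders HTML returned by the server. The route and data are hypothetical.
from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    # All "heavy lifting" (data access, business logic, rendering)
    # happens server-side; the hard-coded list stands in for a DB query.
    items = ["alpha", "beta", "gamma"]
    rows = "".join(f"<li>{item}</li>" for item in items)
    return f"<html><body><ul>{rows}</ul></body></html>"

if __name__ == "__main__":
    app.run(port=5000)
```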

Examples of web apps

For many, web services such as WebEx, electronic banking, online shopping applications, and eBay fall into this category, inasmuch as they are exclusively web-based with limited options for consumer customization.

Conclusion

Application service providers have been quick to exploit the advantages brought about by pioneering web app framework technologies for greater customer reach. However, these technologies are not necessarily optimized for building new apps for the cloud era.

Cloud apps are web apps in the sense that they can be used through web browsers, but not all web apps are cloud apps. Software vendors often bundle web apps and sell them as “cloud” apps simply because cloud is the latest buzzword, but web apps do not offer the same richness in functionality and customization you’ll get from cloud apps. So, buyer beware!

Some software application vendors also falsely assume that just because their application runs on the web, it automatically qualifies as a cloud app. This is not the case. For a web app to evolve into a cloud app, it should exhibit certain properties, such as:

  • True multi-tenancy to support consumers’ varying requirements and needs
  • Support for virtualization technology, which plays a starring role in cloud-era apps; web applications should either be built to support it or be re-engineered to do so

The good news is that vendors looking to move into the cloud app space now have rich development platforms and frameworks to choose from, whether migrating from an existing web app or starting from scratch. These new-age cloud app development platforms are affordable and agile, reducing time to market and software development complexity.

VMware Cloud Foundry, Google App Engine, Microsoft Azure, Appcara, Salesforce (Heroku and Force.com), AppFog, Engine Yard, Standing Cloud, and Mendix are examples of such development platforms offering cloud-based technology for building modern applications.

Infographic: Security and the Cloud 2012

This infographic is provided by CloudPassage, a provider of server security products purpose-built for dynamic public and hybrid cloud hosting environments. It reflects a survey of IT professionals and their general attitudes toward cloud uses and security issues.

[Infographic: cloud-security-survey-infographic]

Google Apps vs. Microsoft’s Office 365

Given the extent of development both Google Apps and Microsoft’s Office 365 have undergone recently, and the sheer volume of features each service now offers, deciding which “office cloud” might be best suited for your small business or enterprise has become an extremely daunting task, abounding with risk. There are a number of factors to consider in order to avoid actually harming an organization’s productivity or drowning it in sunk costs and unforeseen expenditures. Furthermore, there are other trade-offs to choosing one service over the other, which go well beyond the basic set of productivity tools each offers — SLAs, application support, and perhaps especially, user culture and adoption.

To help lessen the apprehension of choosing Google Apps over Office 365, or vice versa, we’ll compare the two services area by area. First, we’ll look at the most basic office applications: productivity and document management apps.

Productivity applications

The applications that both Google Apps and Office 365 are best known for are their productivity suite apps, which include word processing, presentation, and spreadsheet software. These applications are known as Google Docs under Google Apps, and Office Web Apps under Office 365. Both suites also include supplementary tools to accompany these core applications, likewise designed to promote increased office productivity. For Google Apps, this includes its Forms and Drawing applications, while Office 365 offers an online version of its desktop OneNote software.

Given that most enterprise workers probably spend the majority of their day creating or collaborating on word processor, presentation, or spreadsheet documents, this is where both Google and Microsoft have placed most of their development emphasis. For the most part, Google has focused on stripping down what Microsoft has built upon with its Office desktop software for years, making a simple yet intuitive interface that users of productivity software, like Office, can easily navigate without much of a learning curve. Microsoft has taken a similar approach, but takes any learning curve completely out of the equation by simply reducing certain parts, or advanced features, of its desktop Office 2010 software. Microsoft’s desktop and cloud versions of its office software are almost a spitting image of each other, somewhat analogous to how one may purchase a car: choosing Office Web Apps is like opting for a cheaper model of the same car, minus the leather seats, faster engine, sunroof, and other fancy add-ons.

Both productivity suites will get the job done, meaning that 99 percent of the principal work that the majority of enterprise users do can be accomplished with either set of applications. However, for that remaining 1 percent of work, Google Apps is left behind, simply because Office Web Apps affords users the option of integrating with the desktop equivalents of Word, PowerPoint, Excel, and even OneNote, just in case more advanced tasks, like creating VBA programs, are required.

Document management

There is more to document management than the productivity applications (e.g., word processors or spreadsheets) themselves. There are a number of obligations one has to undertake in order to manage documents, and the data or information within them, effectively. Namely, with respect to what’s available on Google Apps/Google Docs and Office 365/Office Web Apps, there’s sharing and collaborating on documents with peers, the ability to edit documents offline in case an Internet connection is lost or unavailable, synchronization between cloud- and desktop-derived documents, document navigation and search, document importing and exporting, and document revisions/versioning.

Most enterprise users work in teams, and they not only need to share their documents with teammates but also collaborate with them on those documents in real time. Both Google Apps and Office 365 have the ability to share, collaborate on, or co-author documents in real time (or, in Office 365’s case, pseudo real time). The real difference between the two might be a matter of preference: where Google Docs takes a more straightforward approach to sharing and collaboration, Office 365 puts in place a number of mechanisms to prevent two authors from editing the same data at the same time. Both can be considered suitable for even the largest of enterprises, though, especially considering that this kind of technology is rather new and not really available elsewhere, at least on a wide scale.

Offline editing, document syncing, and importing/exporting of documents are three closely related features, as all three are enmeshed with the idea that the user demanding this kind of service, whether through Google Docs or Office Web Apps, is probably not intending to fully immerse themselves in the cloud. Both Google Apps and Office 365 have ample means of providing all of these features, but each gives reason for concern when it comes to the large enterprise.

Document navigation and search might be a set of features formerly taken for granted, given that we all once used OS-based file management software like Windows Explorer, until recently, when Internet browsers became necessary to access documents online. Although advances have been made in this territory, most will find that both services lack a certain “je ne sais quoi”: fluidity.

Document revision can be defined as storing old copies of the same document, with the ability to save a revision on a whim, while versioning takes this routine a step further by giving the author the ability to understand the differences between each version, as with CVS (Concurrent Versions System). Although both Google Docs and Office Web Apps have sufficient methods in place to keep track of a long history of revisions, any versioning feature is completely missing. This might only be necessary for advanced users, such as developers, but it is something that can certainly deepen the argument for moving entirely to the cloud. Furthermore, it could be argued that all users would benefit from this kind of version control.