When it comes to data and cloud computing, think proactively 
8/3/2009
Data is moving to the cloud and has been for some time. However, when you consider moving large data sets around the Internet, cloud to cloud or cloud to company, you have to weigh the architectural trade-offs, and there are several.

The core issue is that data residing in the cloud performs well and retains its integrity as long as it lives in the same domain as the core applications and processes that use it. Thus, if your data resides on Amazon's EC2, the best approach is to place your applications and processes there as well.

Why? It's all about the transmission of data requests and result sets. If the data is located in a different domain -- say, another cloud computing provider -- than the applications that consume it, then the result sets, which are typically huge, have to find their way back over the Internet to the requesting application or process. The system suffers from the latency that comes with moving a lot of data over the Internet. That's not the case when everything lives within a single domain, whether cloud-delivered or on-premise.

I'm seeing cloud computing performance issues come up time and time again because we now have cloud computing providers that offer a single service component, such as a database, a development platform, or integration. Thus, you can get your database from one provider, your application development platform from another, and your process integration engine from a third. While this mixing and matching of fine-grained cloud-delivered IT resources is fine in many instances, if you are consistently moving large amounts of data from cloud provider to cloud provider, or between on-premise systems and the clouds, performance problems will surely arise.
Moreover, you may find other problems as well, such as database integrity issues, including corruption and data loss.

So what are the architectural guidelines when it comes to data and cloud computing? There are two main ones.

1. Consider the size of the result sets. Result sets that are consistently large should never be returned from a remote source, whether a cloud provider or an on-premise system. Instead, place the data as close as possible to the applications and processes that use it.
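The latency penalty described above can be sketched with back-of-the-envelope arithmetic. The figures below (100 Mbps across providers over the public Internet, 10 Gbps within a single domain, and a 1 GB result set) are illustrative assumptions of mine, not numbers from this article:

```python
# Back-of-the-envelope estimate of how long a large result set takes to
# return to the requesting application, depending on where the data lives.
# All bandwidth, latency, and size figures are illustrative assumptions.

def transfer_seconds(result_set_bytes: float,
                     bandwidth_bps: float,
                     rtt_seconds: float) -> float:
    """Rough transfer time: one round trip plus serialization at link speed."""
    return rtt_seconds + (result_set_bytes * 8) / bandwidth_bps

ONE_GB = 1_000_000_000  # a "typically huge" result set

# Cross-domain: data at another cloud provider, reached over the Internet.
cross_cloud = transfer_seconds(ONE_GB, bandwidth_bps=100e6, rtt_seconds=0.08)

# Same domain: data co-located with the application (e.g., both on EC2).
same_domain = transfer_seconds(ONE_GB, bandwidth_bps=10e9, rtt_seconds=0.0005)

print(f"cross-cloud: {cross_cloud:.1f} s, same domain: {same_domain:.2f} s")
```

Under these assumed numbers, the same query returns roughly two orders of magnitude more slowly across providers than within one domain, which is exactly the trade-off the guideline addresses.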