<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>RainStor</title>
	<atom:link href="http://rainstor.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://rainstor.com</link>
	<description>Just another WordPress site</description>
	<lastBuildDate>Wed, 22 May 2013 16:19:21 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	
		<item>
		<title>Every Byte is Sacred &#8211; The Life-cycle of Your Enterprise Data Record</title>
		<link>http://rainstor.com/every-byte-is-sacred-the-life-cycle-of-your-enterprise-data-record/</link>
		<comments>http://rainstor.com/every-byte-is-sacred-the-life-cycle-of-your-enterprise-data-record/#comments</comments>
		<pubDate>Fri, 17 May 2013 17:56:53 +0000</pubDate>
		<dc:creator>JoAnne McDougald</dc:creator>
				<category><![CDATA[Blog Entries]]></category>

		<guid isPermaLink="false">http://rainstor.com/?p=4011</guid>
		<description><![CDATA[By Deirdre Mahon, VP Marketing The typical life cycle of your average data transaction or event in the midst of all your data assets goes something like this; it comes screaming in&#8230; eh rather, streaming in to your enterprise along with millions of other records building up to billions each day where it is collected [...]]]></description>
				<content:encoded><![CDATA[<figure><a href="http://rainstor.com/2013_new/wp-content/uploads/2013/05/Meaning-of-Life.jpg"><img class="alignnone" alt="Every Byte is Sacred Blog Post" src="http://rainstor.com/2013_new/wp-content/uploads/2013/05/Meaning-of-Life.jpg" /></a></figure>
<p>By Deirdre Mahon, VP Marketing</p>
<p>The typical life cycle of your average data transaction or event in the midst of all your data assets goes something like this: it comes screaming in&#8230; eh rather, streaming into your enterprise along with millions of other records, building up to billions each day, where it is collected and stored for some period of time.  You probably store it in a transactional or <a href="http://en.wikipedia.org/wiki/Relational_database" target="_blank">relational</a> database – maybe Oracle, DB2 or Sybase – and depending on its nature and purpose in life, some degree of processing may take place, such as an update, an augmentation or a transformation, or it may just remain “as is”.  You may even store it in a <a href="http://en.wikipedia.org/wiki/Hadoop">Hadoop</a>-based environment to take advantage of low-cost hardware and scale, or because you want to perform <a href="http://www.predictiveanalyticsworld.com/predictive_analytics.php">predictive analysis</a> on a set of <a href="http://rainstor.com/raw/">raw, detailed data</a> as opposed to using a standard BI tool to run a “canned” summary report at the end of the quarter.</p>
<p>A few months down the road, the primary repository that collected all this streaming data reaches capacity limits, and you figure that all that raw detailed data has probably run its course in terms of adding any further value. So you decide to move it with other records downstream to a central data warehouse such as <a href="http://www.sybase.com/products/datawarehousing/sybaseiq">Sybase IQ</a> or <a href="http://www.teradata.com/">Teradata</a>, where it is integrated with other records from other parts of the business – some may call this overall data enrichment.  Now it lives in the crème de la crème of warehouses, giving high performance for some pretty sophisticated analysis such as cross-tabulating with other data sets and cross-referencing against core customer reference data residing in the central warehouse.  Business users are happy: they get an up-to-date, accurate and broader picture of what is going on in the business quarter over quarter, compared against previous years.</p>
<p>Now what?  This aging data record (albeit only a few quarters old) still sits in your expensive Teradata warehouse, and it is not being accessed as frequently for the regular business reports and KPI dashboards.  As time goes by, the data record ages further and although it is crammed into the same warehouse with all the other (once current) data, it continues to be even less frequently “tapped”.  Its value in helping the business analyst make short-term decisions has dissipated, and it is now eating into storage capacity.  Various options are evaluated &#8211; buy more expensive storage capacity, move the data record to offline tape or perhaps try to find a newer technology that will allow you to keep it online for longer.  As is often the case, once you put it on tape, a business user comes knocking to gain access.  However, budgets are maxed out and you are left with little choice, so you take the path of offline archive tape.  In the process, you get down on your knees and pray that <a href="http://en.wikipedia.org/wiki/Audit">auditors</a> and <a href="http://en.wikipedia.org/wiki/Regulator">regulators</a> won’t come knocking for that specific record or that the business won’t need to analyze that far back any time soon.</p>
<p>However, you want to be innovative and so continue to investigate lower-cost database management and storage options.  You have looked at <a href="http://en.wikipedia.org/wiki/Private_cloud#Private_cloud">private cloud</a> and investigated Hadoop, albeit a little risky because of its immaturity around meeting demanding query SLAs, in addition to the even scarier prospect of “what if it’s not secure and fully compliant with <a href="http://www.17a-4.com/regulations-summary/">Dodd-Frank / WORM</a> regulations?” Even though you have some data running in the cloud, your monthly bill is going up, and increasing storage capacity in the cloud is not necessarily the answer you are looking for.</p>
<p>The last time you had to pull data records from offline tape, it consumed about two weeks of services time, not to mention the one-month backlog now amassed due to other business units demanding tape restores for other audit purposes.  Although tape is initially cheap, it bites once you try to restore from it.  And that is not even taking into consideration the different source versions you need to manage over time.</p>
<p>Even though this now lonely record is a couple of years old, its value really never goes away.  The business doesn’t want to let go – what if they need to run reports comparing figures at the half-decade mark?  What if the regulators come knocking with a critical fraud investigation and this record is absolutely key to providing a clear, accurate picture?  Putting it on tape is a scary proposition, and you want to avoid it at all costs.  Let’s face it: in the 21<sup>st</sup> century, with all the technology capabilities now available, tape is antiquated.</p>
<p>You realize that <a href="https://en.wikipedia.org/wiki/Monty_Python%27s_Life_of_Brian"><i>every byte is sacred</i></a>, and so figuring out the best database and storage solution based on the record’s lifecycle and purpose is what drives your decision.  Now if only budget weren’t such an issue.</p>
<p>Stay tuned for the data lifecycle update and discover the range of options for your sacred enterprise data record.</p>
]]></content:encoded>
			<wfw:commentRss>http://rainstor.com/every-byte-is-sacred-the-life-cycle-of-your-enterprise-data-record/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>GigaOM Research Highlights RainStor for Big Data Projects</title>
		<link>http://rainstor.com/gigaom-research-highlights-rainstor-for-big-data-projects/</link>
		<comments>http://rainstor.com/gigaom-research-highlights-rainstor-for-big-data-projects/#comments</comments>
		<pubDate>Tue, 14 May 2013 13:01:38 +0000</pubDate>
		<dc:creator>JoAnne McDougald</dc:creator>
				<category><![CDATA[Press Releases]]></category>

		<guid isPermaLink="false">http://rainstor.com/?p=3977</guid>
		<description><![CDATA[Leading IT research group report indicates RainStor is a fraction of the cost of Oracle’s enterprise database, for handling large-scale analytics and archiving. San Francisco, California – May X, 2013 – RainStor, provider of the world’s most efficient enterprise database for managing and analyzing all historical data, announces a new report from GigaOM Research highlighting the [...]]]></description>
				<content:encoded><![CDATA[<h2><i>Leading IT research group report indicates RainStor is a fraction of the cost of Oracle’s enterprise database, for handling large-scale analytics and archiving.</i></h2>
<p><b>San Francisco, California – May X, 2013 </b>– <a href="http://www.rainstor.com" target="_blank">RainStor</a>, provider of the world’s most efficient enterprise database for managing and analyzing all historical data, announces a new report from <a href="http://pro.gigaom.com/">GigaOM Research</a> highlighting the RainStor database approach for Big Data projects. GigaOM’s “Managing Big Data without Breaking the Bank,” sponsored by RainStor, discusses the technical advantages and business benefits of deploying databases that focus on data compression and low-cost storage for “online analytic archives” to solve enterprise Big Data challenges.</p>
<p>The research analyst, George Gilbert, describes RainStor’s technology as a viable and cost-effective alternative for managing massive analytics projects: “Much of the explosion in data volumes that needs to be analyzed doesn’t need to be updated. In other words, it can be stored as an archive in a deeply-compressed format while still online for query and analysis.” Beyond data compression, Gilbert states that RainStor delivers greater analytic query performance, greater ingest speed of new data, deployment flexibility and ease of administration compared with traditional enterprise databases and data warehouses.</p>
<p>The report showed a startling cost comparison of Oracle Exadata against RainStor across all costs, including server hardware, storage, licenses and administration: Oracle totaled $4.77 million and RainStor $353,625. Gilbert explains: “A company using the RainStor approach of maximum compression and traditional SQL queries can run on commodity hardware and with license and administrative costs that are materially less expensive. Because of the advanced data compression, this method needs significantly less storage and server hardware to process that smaller amount of data.”</p>
<p>The report also cited two RainStor customer examples: a large telecommunications provider that stored more data at 1/10th the cost of a traditional RDBMS and a large investment bank that saved millions of dollars over a three-year timeframe by using RainStor to maintain regulatory-driven archives of trading data. “Using RainStor as their online analytic archive solution, the bank expects that it will eventually shrink 30 petabytes (PB) of data down to 1PB,” Gilbert states in the report. The full report is downloadable and <a href="http://pages.rainstor.com/GigaOmWhitePaper_LP.html">available here</a>.</p>
<p>“RainStor’s proven database helps customers in compliance-intensive sectors drastically cut TCO – in other words <i>cheap and deep</i> analytics for long term historical data,” says John Bantleman, CEO at RainStor. “One of the recurring themes we see from customers is the need to reduce the cost of managing enterprise data warehouses, specifically Teradata. What’s more, we uniquely scale on any platform, from a content-addressed storage (CAS) device to a private cloud or native on HDFS, which further reduces cost and gives you tremendous flexibility to scale and grow.”</p>
<p><strong>About GigaOM Research</strong></p>
<p>GigaOM Research gives you insider access to expert industry insights on emerging markets. Focused on delivering highly relevant and timely research to the people who need it most, our analysis, reports, and original research come from the most respected voices in the industry. Whether you’re beginning to learn about a new market or are an industry insider, GigaOM Pro addresses the need for relevant, illuminating insights into the industry’s most dynamic markets.</p>
<p><b>About RainStor </b></p>
<p>RainStor provides the world’s most efficient database that reduces the cost, complexity and compliance risk of managing enterprise data. RainStor’s patented technology enables customers to cut infrastructure costs by 90%. RainStor scales anywhere: on-premise or in the cloud, and natively on Hadoop. Among RainStor’s 150+ customers are the top 20 world’s largest communications providers, top 10 biggest banks and financial services organizations using RainStor to manage historical data, while saving millions. For more info: <a href="http://www.rainstor.com">www.rainstor.com</a> or join the conversation: @rainstor.</p>
<p>###</p>
<p>RainStor contact:</p>
<p>Kevin Wolf<br />
TGPR<br />
(650) 327-1641<br />
<a href="mailto:kevin@tgprllc.com">kevin@tgprllc.com</a></p>
]]></content:encoded>
			<wfw:commentRss>http://rainstor.com/gigaom-research-highlights-rainstor-for-big-data-projects/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Dell and RainStor Team up for Webcast on Big Data in Banking</title>
		<link>http://rainstor.com/dell-and-rainstor-team-up-for-webcast-on-big-data-in-banking/</link>
		<comments>http://rainstor.com/dell-and-rainstor-team-up-for-webcast-on-big-data-in-banking/#comments</comments>
		<pubDate>Fri, 10 May 2013 18:15:09 +0000</pubDate>
		<dc:creator>JoAnne McDougald</dc:creator>
				<category><![CDATA[Press Releases]]></category>

		<guid isPermaLink="false">http://rainstor.com/?p=3962</guid>
		<description><![CDATA[The event will highlight best practices for addressing infrastructure challenges at large financial institutions involved in massive data analytics and archiving projects. San Francisco, California – May 10, 2013 – RainStor, provider of the world’s most efficient enterprise database for managing and analyzing all historical data, will host a webinar on Tuesday, May 14th at [...]]]></description>
				<content:encoded><![CDATA[<h2><i>The event will highlight best practices for addressing infrastructure challenges at large financial institutions involved in massive data analytics and archiving projects.</i></h2>
<p><b>San Francisco, California – May 10, 2013 </b>– <a href="http://www.rainstor.com">RainStor</a>, provider of the world’s most efficient enterprise database for managing and analyzing all historical data, will host a <a href="http://pages.rainstor.com/IIADellRainStorWebcast_RSLP.html">webinar</a> on Tuesday, May 14th at 11 AM Pacific, titled “Banking on Big Data? Reduce Risk, Stay Compliant and Reduce Cost.”  The event will feature a discussion of best practices for a new era in analytics from Sarah Gates, VP of research at the International Institute for Analytics (<a href="http://iianalytics.com/" target="_blank">IIA</a>), followed by a technology presentation by Remi Bello, Senior Consultant, Marketing Strategy &amp; Analytics at Dell, and Ramon Chen, VP of Product Management at RainStor. Dell partners with RainStor to offer a unified big data software and storage platform for customers who need to retain large volumes of structured and semi-structured data sets at a much lower cost per terabyte.</p>
<p>“Banks have long been considered innovators when using data analytics to tackle numerous business challenges such as risk management, price discovery and fraud detection,” says Bello. “Yet today, the volume of data is much bigger and more diverse than ever before. At the same time, regulatory demands for banks have become much more stringent and storage requirements have increased as growth rates continue to increase and the business wants to keep data for longer periods of time.” This complexity, he says, has necessitated new approaches and technologies for Big Data infrastructure.</p>
<p>“Banks continue to struggle to provide fast access to their growing stores of historical data for better intelligence and at the same time keep auditors happy,” says Ramon Chen at RainStor. “Yet the cost of managing and growing the underlying infrastructure is no longer sustainable. Through our partnership with Dell, we are bringing to market the most cost-effective solution to enable banks to not only stay compliant but to help make sound business decisions around the rich source of market and operational data.”</p>
<p><b>Registration for the free online event is available at </b><a href="http://pages.rainstor.com/IIADellRainStorWebcast_RSLP.html" target="_blank">this page</a><b>. </b></p>
<p><b>About Dell </b></p>
<p>Dell Inc. (NASDAQ: DELL) listens to customers and delivers innovative technology and services that give them the power to do more. For more information, visit <a href="http://www.dell.com/">www.dell.com</a>.</p>
<p><b>About RainStor </b></p>
<p>RainStor provides the world’s most efficient database that reduces the cost, complexity and compliance risk of managing enterprise data. RainStor’s patented technology enables customers to cut infrastructure costs by 90%. RainStor scales anywhere: on-premise or in the cloud, and natively on Hadoop. Among RainStor’s 150+ customers are the top 20 world’s largest communications providers, top 10 biggest banks and financial services organizations using RainStor to manage historical data, while saving millions. For more info: <a href="http://www.rainstor.com">www.rainstor.com</a> or join the conversation: @rainstor.</p>
<p>###</p>
<p>RainStor contact:<br />
Kevin Wolf<br />
TGPR<br />
(650) 327-1641</p>
<p><a href="mailto:kevin@tgprllc.com">kevin@tgprllc.com</a></p>
]]></content:encoded>
			<wfw:commentRss>http://rainstor.com/dell-and-rainstor-team-up-for-webcast-on-big-data-in-banking/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Database Interoperability is No Laughing Matter</title>
		<link>http://rainstor.com/database-interoperability-is-no-laughing-matter/</link>
		<comments>http://rainstor.com/database-interoperability-is-no-laughing-matter/#comments</comments>
		<pubDate>Mon, 06 May 2013 19:30:27 +0000</pubDate>
		<dc:creator>JoAnne McDougald</dc:creator>
				<category><![CDATA[Blog Entries]]></category>

		<guid isPermaLink="false">http://rainstor.com/?p=3926</guid>
		<description><![CDATA[by Mark Cusack, Chief Architect I was asked to write about how we connect and exchange data with other databases.  My first response was “Oh Mum, do I have to?”  My second response was to take a nap.  The problem with data movement is that it’s dull.  Records are of interest when they are laid [...]]]></description>
				<content:encoded><![CDATA[<figure><a href="http://rainstor.com/2013_new/wp-content/uploads/2013/05/Fred-and-Ginger.png"><img class="alignnone" alt="Fred and Ginger" src="http://rainstor.com/2013_new/wp-content/uploads/2013/05/Fred-and-Ginger-300x241.png" /></a></figure>
<p>by Mark Cusack, Chief Architect</p>
<p>I was asked to write about how we connect and exchange data with other databases.  My first response was “Oh Mum, do I have to?”  My second response was to take a nap.  The problem with data movement is that it’s dull.  Records are of interest when they are laid seductively across a table.  There they can be selected, projected, joined, grouped, aggregated and ordered.  When they are serialized and flying down a pipe?  Not so much.</p>
<p>How is it possible to make the boring subject of <a href="http://rainstor.com/products/connectors/fast-connect/">database connectors</a> interesting?  Well, one way is to introduce parallelism.  Concurrency is cool, and concurrency equals speedup.  The faster you can move data to somewhere interesting where you can do things with it, the better in my book.  And the king of concurrency in the data warehousing business is <a href="http://www.teradata.com/products-and-services/Teradata-Database/#tabbable=0&amp;tab1=0&amp;tab2=0&amp;tab3=0&amp;tab4=0">Teradata</a>.  So what I want to talk about is our parallel connector to Teradata: RainStor <i>FastConnect</i>™.</p>
<p>We spent a year or so working closely with the boys and girls at Teradata Labs in San Diego, integrating their EDW platform with RainStor.  I was directly involved in that project.  In the end, it didn’t make sense to progress with the original plan to deliver a RainStor EDW archiving appliance.  But the legacy of that integration work lives on in the form of a bi-directional, high-bandwidth connector between Teradata and RainStor.</p>
<p>Why would you want to transfer data from Teradata to RainStor?  In two words: tape avoidance.  Tape is the worst place for records to live.   If, like me, you think that data in flight is inaccessible and therefore lacks value, think about tape.  It’s where records go to die.  It’s also a serious problem in today’s environment of increasing regulation and data governance.</p>
<p>Rather than consigning your <a href="http://rainstor.com/cold-bold/" target="_blank">older, colder</a> EDW transactions to tape via <a href="http://www.teradata.com/Backup-Archive-and-Restore/#tabbable=0&amp;tab1=0&amp;tab2=0&amp;tab3=0">Teradata’s Backup Archive Restore</a> (BAR) solution, you can send those records to RainStor.  The records remain on-line and query-able via RainStor’s SQL engine.  If you operate in a heavily regulated industry like finance, you can run the SEC’s reports against RainStor, rather than going through the time-consuming and expensive process of restoring the data from tape to your Teradata EDW first.</p>
<p>One of the key reasons for having a scalable and efficient way of transferring data from Teradata to RainStor is to reduce the time spent on archiving.  Archiving is a cost-center rather than a profit-center activity.  You don’t want to devote your expensive production Teradata system or DBOs to archiving tasks for long.  You want to lift and shift those records to RainStor quickly, so that you can crack on with the important job of running the analytics required to power your business.  Moving those records to cheap RainStor storage means you’ve freed up space on your expensive data warehouse, making it more efficient and delaying the need to scale out capacity on the Teradata side.</p>
<p>Teradata has two efficient protocols for moving data in and out.  These are FastExport and FastLoad.  The problem with these methods is that although they scale out well over the AMPs on the Teradata system, all that data comes in or out of a single point on the client-side.  There’s a network and a client-side processor bottleneck that constrains the throughput in and out of Teradata.  The answer is to use Teradata’s TPTAPI and RainStor FastConnect to scale out on the client-side.</p>
<p>On the one side you have an <i>N </i>node Teradata EDW.  On the other side you have <i>M </i>RainStor servers in a cluster.  FastConnect supports <i>N:M</i> connectivity and data transfer between the two systems.  Not only can you pull data out of a Teradata table and into an equivalent RainStor table, you can move it back in the other direction too.  So if there are specific, deep analytics you need to perform on Teradata, which RainStor’s SQL92 engine can’t handle, you can transfer the data back into an empty table on Teradata and perform them there.</p>
<p>So what about the performance?  Rather than embark on an in-depth analysis about how <i>fast</i> FastConnect is, I’ll adopt the cavalier attitude of directly quoting the results of the last telephone conversation I had about it.  In a current experiment with a very large US telco customer of ours, we’re seeing a transfer rate of around 60MB/second into a RainStor server over a 1GbE connection to a Teradata EDW.   And this is linearly scalable across RainStor servers.  Not bad.  I’ve seen rates as high as 80MB/s/server in the lab.</p>
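<p>To put those figures in context, here is a rough back-of-envelope sketch in Python.  It takes the 60MB/second per-server rate quoted above and assumes perfectly linear scaling and decimal units; the function name, the 10TB table size and the server counts are purely illustrative assumptions, not measurements and not part of FastConnect.</p>
<pre><code>
# Back-of-envelope estimate only. The 60 MB/s default is the per-server
# rate quoted in the post; everything else here is an assumption.
MB_PER_TB = 1_000_000  # decimal units: 1 TB = 1,000,000 MB


def transfer_hours(data_tb, servers, mb_per_sec_per_server=60.0):
    """Estimate hours needed to move data_tb terabytes.

    Assumes aggregate throughput grows linearly with the number of
    RainStor servers, which is an idealisation.
    """
    aggregate_mb_per_sec = servers * mb_per_sec_per_server
    seconds = data_tb * MB_PER_TB / aggregate_mb_per_sec
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical 10 TB table moved with 1, 4 and 8 servers.
    for servers in (1, 4, 8):
        hours = transfer_hours(10, servers)
        print(f"{servers} server(s): about {hours:.1f} hours")
</code></pre>
<p>Under those assumptions, doubling the number of RainStor servers halves the estimated transfer time; real transfers will also depend on the network and on what the Teradata side can deliver.</p>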
<p>Teradata and RainStor work well together.   Data transfer rates are fast, and the numbers scale.  We’ve also got tools to help with the extraction and conversion of Teradata DDL into RainStor DDL; one table defined in Teradata equals one table defined in RainStor.  On both Teradata and RainStor, you describe exactly the data you want to move to the target system by running a parallel SQL query on the source system.</p>
<p>As time goes on we’re targeting even deeper integration, to make the process of transferring records to and from Teradata as seamless as possible.  We’ll add automated policy definition and validation tools to ensure that the right data is moved to the right place at the right time.  Query federation across Teradata and RainStor systems is also an area of interest to us moving forward.</p>
<p>Have I delivered on the promise of making database connectivity interesting?   I’m not so sure.   But what I am sure of is that few things go together as well as Teradata and RainStor.  I’m sure Fred and Ginger would agree.</p>
]]></content:encoded>
			<wfw:commentRss>http://rainstor.com/database-interoperability-is-no-laughing-matter/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Dell to co-host webinar with RainStor and the International Institute for Analytics (IIA) on big data within banking</title>
		<link>http://rainstor.com/dell-to-co-host-webinar-with-rainstor-and-the-international-institute-for-analytics-iia-on-big-data-within-banking/</link>
		<comments>http://rainstor.com/dell-to-co-host-webinar-with-rainstor-and-the-international-institute-for-analytics-iia-on-big-data-within-banking/#comments</comments>
		<pubDate>Wed, 01 May 2013 22:17:55 +0000</pubDate>
		<dc:creator>JoAnne McDougald</dc:creator>
				<category><![CDATA[Blog Entries]]></category>

		<guid isPermaLink="false">http://rainstor.com/?p=3913</guid>
		<description><![CDATA[by Remi Bello of Dell.  Skilled at leveraging big piles of data for business value, banks have long been considered innovators for their use of analytics to tackle such critical banking functions as risk management, price discovery and fraud detection. However, things are different today. The piles of data are much bigger, faster, and far [...]]]></description>
				<content:encoded><![CDATA[<p>by <a href="http://www.linkedin.com/in/remibello" target="_blank">Remi Bello of Dell.  </a></p>
<p>Skilled at leveraging big piles of data for business value, banks have long been considered innovators for their use of analytics to tackle such critical banking functions as risk management, price discovery and fraud detection. However, things are different today. The piles of data are much bigger, faster, and far more unstructured and complex than they ever have been, necessitating new industry responses.</p>
<p>For example, with Dodd-Frank and Basel III, regulators demand that banks provide increased transparency and risk mitigation to allay concerns. Regulatory compliance is driving the need for more holistic and regular reporting and auditing. This has contributed to a spike in the amount of data stored and the length of time it is stored, which makes the data increasingly complex to manage and more difficult to derive business value from.</p>
<p>Perhaps a 2010 Gartner data management survey of financial services firms best underscores this trend. It found that only 43% of responding firms give the quality of their data for supporting operations a high rating. Similarly, only 33% gave the quality of their data for supporting business intelligence and management decision making a high rating.</p>
<p>But now a solution exists that helps banks reduce the cost and complexity of retaining big data while also improving data retrieval and analysis. The <a href="http://www.dellstorage.com/WorkArea/DownloadAsset.aspx?id=2904">Dell-RainStor Big Data Retention Solution</a> combines Dell storage (including the <a href="http://www.dellstorage.com/dx-object/">Dell DX Object Storage Platform</a>), <a href="http://www.dell.com/Learn/us/en/555/by-service-type?c=us&amp;l=en&amp;s=biz&amp;cs=555&amp;delphi:gr=true">Dell Services</a>, and <a href="http://rainstor.com/products/rainstor-database/">RainStor database technology</a> to help reduce the cost of retaining big data through data reduction, simplified data management and near-perfect scalability, while providing easy access to the data from standard SQL BI/analytics tools.</p>
<p>Join Dell, RainStor and the International Institute for Analytics (IIA) for a webinar titled <a href="http://pages.rainstor.com/IIADellRainStorWebcast_DellLP.html">“Banking on Big Data? Reduce Risk, Stay Compliant and Reduce Cost”</a> to explore innovative ways in which banks can reduce the cost and complexity of retaining big data. The IIA will introduce its acclaimed <a href="http://iianalytics.com/a3/">Analytics 3.0</a> framework, outlining how banks can bring together the best of traditional analytics and big data to drive business impact.</p>
<p><strong>Date:</strong> Tuesday, May 14, 2013</p>
<p><strong>Time:</strong> 11:00am PT / 2:00pm ET</p>
<p><strong>Duration:</strong> 1 Hour</p>
<p><strong>Registration:</strong> <a href="http://bit.ly/14RwctL">http://bit.ly/14RwctL</a></p>
]]></content:encoded>
			<wfw:commentRss>http://rainstor.com/dell-to-co-host-webinar-with-rainstor-and-the-international-institute-for-analytics-iia-on-big-data-within-banking/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>But Hadoop Already Does This!</title>
		<link>http://rainstor.com/hadoop-this/</link>
		<comments>http://rainstor.com/hadoop-this/#comments</comments>
		<pubDate>Fri, 26 Apr 2013 17:45:03 +0000</pubDate>
		<dc:creator>JoAnne McDougald</dc:creator>
				<category><![CDATA[Blog Entries]]></category>

		<guid isPermaLink="false">http://rainstor.com/?p=3876</guid>
		<description><![CDATA[by Mark Cusack, Chief Architect, RainStor Jimmy Hill is a footballing legend in the UK.  His name is also used as the standard refrain when exclaiming disbelief in a particular notion.  So, for example, when prospects, customers or my wife try to explain to me that Hadoop already does all of the things that RainStor can [...]]]></description>
				<content:encoded><![CDATA[<p>by Mark Cusack, Chief Architect, RainStor</p>
<figure><a href="http://rainstor.com/2013_new/wp-content/uploads/2013/04/JimmyHill.jpg"><img class="alignleft" alt="JimmyHill" src="http://rainstor.com/2013_new/wp-content/uploads/2013/04/JimmyHill-220x300.jpg" /></a></figure>
<p><a href="http://en.wikipedia.org/wiki/Jimmy_Hill" target="_blank">Jimmy Hill</a> is a footballing legend in the UK.  His name is also used as the standard refrain when exclaiming disbelief in a particular notion.  So, for example, when prospects, customers or my wife try to explain to me that <a href="http://en.wikipedia.org/wiki/Hadoop">Hadoop</a> already does all of the things that RainStor can do, I respond by saying: “Jimmy Hill it does,” and by stroking my chin mockingly.</p>
<p>I have to admit that since moving to the Bay Area three years ago, the Jimmy Hill putdown has lost the devastating impact that it had in Britain.  Albeit Britain in the 1970s.  But it’s a tough habit to break, and so what I want to do in this post is examine the claims made about Hadoop’s current capabilities from the reassuringly familiar perspective of Jimmy’s protuberant chin.</p>
<p><b>Jimmy Hill Loves Hadoop</b></p>
<p>Before reviewing the evidence against the assertion that Hadoop can do everything, let me get a few housekeeping items out of the way.  We love Hadoop at RainStor.  We first released a version of RainStor that stores its data files directly on <a href="http://rainstor.com/rainstor-delivers-big-data-retention-on-clouderas-distribution-including-apache-hadoop/">HDFS early in 2011</a>, making us the first SQL database to run natively on Hadoop.</p>
<p>In the period since, we’ve added the ability for Pig, Hive and MapReduce jobs to directly access data stored in RainStor’s compressed files on HDFS.  While 80% of our deployments may be on non-Hadoop platforms, 80% of our current customer proof-of-concepts going forward are Hadoop-based.  Hadoop helps RainStor to scale, and we provide Hadoop with a low-latency SQL capability: win-win.</p>
<p><b>Claim 1: Impala Already Provides Fast SQL Queries</b></p>
<p>The first claim that has me reaching for my chin is that Hadoop already provides fast SQL query in the form of Impala or the Stinger initiative.  These projects will be great when they are finished.  But the simple position right now is that neither of them is GA.</p>
<p>It takes a great deal of effort to build an enterprise-class SQL database.  It’s taken us a long time at RainStor – perhaps we’re a bit slow.  We’re still smoothing out some of the rough edges and working to eliminate pesky corner-cases, and we’ve been GA since 2009.</p>
<p><b>Claim 2: Hadoop Already Has Efficient Column &amp; Compression Formats</b></p>
<p>The second claim is that Hadoop has, and is working on, column-based compression, and so <a href="http://rainstor.com/products/rainstor-database/compress/">RainStor’s proprietary compressed</a> file format does not add value.  Well, it’s certainly true that columnar file formats are available and are an active area of Hadoop development.  Can they achieve the sorts of compression factors that RainStor achieves?  Jimmy Hill’s not so sure – partly because the compression efforts in Hadoop tend to be more concerned with improving analytics performance than with reducing disk footprint.  Also, our approach takes us beyond what <a href="http://en.wikipedia.org/wiki/Column-oriented_DBMS">column-oriented schemes</a> do, by additionally identifying patterns of replication between columns.</p>
<p>It’s not all about size, of course.  We take file versioning and backwards compatibility very seriously.  If you trust us to look after your data, you want assurance that it will be <a href="http://rainstor.com/products/rainstor-database/security-compliance/">accessible through all versions</a> of our software and all versions of your schema.  We encapsulate the schema and other metadata within our file format, so files are time-stamped, independent, self-describing slices of the table they belong to.</p>
<p><b>Claim 3: Hadoop Does Data Lifecycle Management</b></p>
<p>The third claim leveled is that Hadoop is a long-term data archiving solution in its own right.  I think this one is the easiest challenge to dismiss, to the extent that I don’t even bother asking Jimmy to warm up on the touchline.  If you’re trying to solve a multi-year archiving problem, then you need mechanisms in place that deal with constantly evolving metadata and data, and Hadoop simply does not have them.</p>
<p>Schema changes are a fact of life, and being able to modify and version schemas on Hadoop is a compelling feature of RainStor.  As is the ability to automatically expire data from tables based on rules.  You may be legally obliged to hold onto records for a certain period of time, but no longer.  Equally, an on-going legal case may require that a set of records be held indefinitely, while those outside the set are expired from the system.  RainStor supports this, and we have production deployments that do this.</p>
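<p>To make the retention idea concrete, here is a minimal sketch of rule-based expiry with a legal hold. It is an illustration only, not RainStor’s implementation: the <code>Record</code> type, the <code>expired_ids</code> function and the day-based retention rule are all hypothetical names and assumptions chosen for the example.</p>

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Record:
    """Hypothetical archived record: an id plus the date it was created."""
    record_id: int
    created: date


def expired_ids(records, retention_days, legal_hold_ids, today):
    """Return ids of records that are past retention and not under legal hold."""
    cutoff = today - timedelta(days=retention_days)
    return [
        r.record_id
        for r in records
        # Expire only records older than the cutoff that no hold protects.
        if cutoff > r.created and r.record_id not in legal_hold_ids
    ]


records = [
    Record(1, date(2005, 1, 1)),
    Record(2, date(2005, 6, 1)),
    Record(3, date(2012, 1, 1)),
]
# Record 2 is tied to an ongoing legal case, so it must be kept indefinitely.
to_expire = expired_ids(records, retention_days=7 * 365,
                        legal_hold_ids={2}, today=date(2013, 4, 1))
print(to_expire)
```

<p>In this sketch the hold always wins over the retention rule, which matches the scenario described above: records inside the held set stay, and records outside it expire on schedule.</p>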
<p>Hadoop doesn’t have mature data lifecycle management features right now.  Project Falcon has just entered Apache incubation, and Cloudera recently announced its <a href="http://blog.cloudera.com/blog/2013/02/new-products-releases/">Navigator product</a>.  Both offer data lifecycle management functions, but clearly both are at a very early stage.  Do they have the features required to operate in heavily regulated financial services or telco environments right now? Jimmy Hill do they.</p>
<p><b>Claim 4: Hadoop Implements an Enterprise-Grade Security Model</b></p>
<p>Hadoop’s security model is not fully developed.  The ability to authenticate users via LDAP or Active Directory might be possible, but support for role-based access control isn’t there.  Neither is <a href="http://rainstor.com/products/rainstor-database/security-compliance/">encryption of data</a> at rest or on the move.  RainStor has these features right now.</p>
<p>RainStor also has tamper detection mechanisms and audit trails.  We record the ‘who, what and when’ for any operation performed against the archive, and store those logs within our own system archive.  These audit trails are fully queryable using our own SQL engine.  Our immutable data model underpins all of these features.</p>
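<p>One common way to make a ‘who, what and when’ log tamper-evident is to chain each entry to the hash of the previous one. The sketch below shows that general technique using Python’s standard <code>hashlib</code> and <code>json</code> modules. It is an assumption for illustration, not a description of how RainStor detects tampering, and <code>append_entry</code> and <code>verify</code> are hypothetical names.</p>

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body):
    """Hash an entry body deterministically (sorted keys, UTF-8)."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(log, who, what, when):
    """Append an audit entry that commits to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    body = {"who": who, "what": what, "when": when, "prev": prev_hash}
    log.append({**body, "hash": _digest(body)})


def verify(log):
    """Return True only if no entry was altered, removed or reordered."""
    prev_hash = GENESIS
    for entry in log:
        body = {k: entry[k] for k in ("who", "what", "when", "prev")}
        if entry["prev"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True


audit_log = []
append_entry(audit_log, "alice", "SELECT on trades", "2013-04-01T09:00:00Z")
append_entry(audit_log, "bob", "DELETE request on trades", "2013-04-01T09:05:00Z")
print(verify(audit_log))
```

<p>Changing any field of an earlier entry changes its hash, which breaks the link stored in every later entry, so <code>verify</code> fails.</p>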
<p><b>Claim 5: Open Source Avoids Lock-In</b></p>
<p>I often have to respond to the assertion that choosing RainStor means choosing to be locked into a proprietary technology.  Lock-in and open source are not mutually exclusive.  If you contract Cloudera to engineer the full solution or to provide you with support for Impala, for example, then you’ve made a commitment to move forward with Cloudera.  Could others offer Impala support later down the line?  Unless they’ve got folks who deeply understand the codebase, then my response is once more, “Jimmy Hill!”</p>
<p>For our part, we ensure that you will always be able to get your data back out of RainStor, in whatever format you choose.  We don’t believe in license keys and we don’t practice data lock-in.  We added Hive, Pig and MapReduce support to ensure you can interchange data freely and efficiently with RainStor and any other data format on Hadoop.</p>
<p><b>The Reality: Hadoop and RainStor are Complementary</b></p>
<p>You find that if you scratch at the surface of any of these claims, the reality is that the standard Hadoop stack provides only a subset of the technology needed to build a data archiving solution, due to a lack of functionality and maturity.  The shortfall needs to be made up by professional services supplied by commercial Hadoop shops, or through in-house engineering – or by using a third party product like ours.</p>
<p>Hadoop provides RainStor with a stable, scalable storage and compute framework.  RainStor provides Hadoop with the layer needed to safely and securely manage huge amounts of data over very long timescales.  At the multi-petabyte level I don’t think that Hadoop or RainStor alone provides 100% of the data archiving solution.  They need each other, and they need to integrate with many other parts of the enterprise too.  I’m as happy as the next believer to worship at the altar of Hadoop, but it can’t and won’t do it all on its own.</p>
<p>I like to think that Jimmy Hill believes this too.</p>
]]></content:encoded>
			<wfw:commentRss>http://rainstor.com/hadoop-this/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>RainStor Named to Inaugural CRN Big Data 100 List</title>
		<link>http://rainstor.com/rainstor-named-inaugural-crn-big-data-100-list/</link>
		<comments>http://rainstor.com/rainstor-named-inaugural-crn-big-data-100-list/#comments</comments>
		<pubDate>Mon, 22 Apr 2013 18:48:08 +0000</pubDate>
		<dc:creator>deirdre-mahon</dc:creator>
				<category><![CDATA[Press Releases]]></category>

		<guid isPermaLink="false">http://rainstor.com/?p=3864</guid>
		<description><![CDATA[RainStor’s Big Data database technology, used today by the world’s largest banks and telcos, reduces data footprints by up to 40x and storage costs by as much as 90% San Francisco, California – April 22, 2013 – RainStor has been recognized by a top technology publication for its success helping the world’s largest companies reduce [...]]]></description>
				<content:encoded><![CDATA[<p><em>RainStor’s Big Data database technology, used today by the world’s largest banks and telcos, reduces data footprints by up to 40x and storage costs by as much as 90%</em></p>
<p><b>San Francisco, California – April 22, 2013 </b>– <a href="http://www.rainstor.com" target="_blank">RainStor</a> has been recognized by a top technology publication for its success helping the world’s largest companies reduce Big Data storage costs.  UBM Tech Channel’s CRN published its inaugural Big Data 100 list <a href="http://www.crn.com">today</a>, including RainStor and other innovators whose products and services have significantly enhanced Big Data management and analysis.</p>
<p>RainStor is a pioneer in the field of Big Data, known for its patented compression technology, which greatly reduces the storage footprint and therefore costs, so customers can easily meet future growth.  Its product, developed to address Big Data by storing everything once and never repeating anything, enables customers to store, manage and analyze extreme data volumes at scale in the most cost-effective way.  RainStor guarantees 90% savings in data storage costs and customers typically see compression rates of 20-40X. Customers include many of the world’s largest banks, telecommunication providers and government agencies. These enterprises benefit from both data storage reduction and fast query access using standard SQL, BI Tools, Hive, MapReduce and Pig when running natively on Hadoop.</p>
<p>“The 2013 Big Data 100 list recognizes vendors that have shown a dedication to the innovation and advancement of today’s Big Data services, and have evolved to meet the needs of today’s business leaders,” said Kelley Damore, Senior Vice President and Editorial Director, CRN. “Companies need technologies that not only help them process and manage all that data, but also offer market insights.  The Big Data 100 vendors set themselves apart with the cutting edge products and services that provide significant growth opportunities for solution providers to build their business and effectively compete in today’s information economy.”</p>
<p><strong>“Data storage costs are outpacing enterprise data growth rates,” said Deirdre Mahon, VP Marketing at RainStor. “We typically see a bank’s data growth as high as 70% annually, which is not sustainable with current IT budgets. RainStor’s technology vastly reduces storage costs as a barrier to success in Big Data projects and speeds up analysis so customers get insights faster.  We are honored to have been recognized by CRN Magazine for our role in the Big Data industry.”</strong></p>
<p><b>About UBM Tech Channel </b>(<a href="http://www.ubmchannel.com">www.ubmchannel.com</a>)</p>
<p>UBM Tech Channel, a UBM company, is the premier provider of IT channel-focused events, media, research, consulting, and sales and marketing services. With more than 30 years of experience and engagement, UBM Tech Channel has the unmatched channel expertise to execute integrated solutions for technology executives managing partner recruitment, enablement and go-to-market strategy in order to accelerate technology sales. To learn more about UBM Tech Channel, visit us at: <a href="http://www.ubmchannel.com/">www.ubmchannel.com</a>.</p>
<p><b>About RainStor </b></p>
<p>RainStor is the only enterprise database that consistently delivers 90% cost savings to customers facing Big Data and Data Warehousing challenges. RainStor’s patented technology provides the industry’s highest rate of data compression, high performance, and flexible on-demand query. RainStor is storage agnostic and can run on-premise or in the cloud and natively on Hadoop. The largest enterprises use RainStor to predictably scale, improve operational efficiency and meet compliance requirements while saving millions in costs. For more info: <a href="http://www.rainstor.com">www.rainstor.com</a> or join the conversation: @rainstor.</p>
<p>###</p>
<p>RainStor contact:<br />
Kevin Wolf<br />
TGPR<br />
(650) 327-1641<br />
<a href="mailto:kevin@tgprllc.com">kevin@tgprllc.com</a></p>
<p>UBM contact:<br />
Betzi-Lynn Hanc<br />
UBM Tech Channel<br />
(508) 416-1182<br />
<a href="mailto:betzi.hanc@ubm.com">betzi.hanc@ubm.com</a></p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://rainstor.com/rainstor-named-inaugural-crn-big-data-100-list/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>What Lies Beneath Big Data Costs? Storage.</title>
		<link>http://rainstor.com/lies-beneath-big-data-costs-storage/</link>
		<comments>http://rainstor.com/lies-beneath-big-data-costs-storage/#comments</comments>
		<pubDate>Fri, 12 Apr 2013 20:01:42 +0000</pubDate>
		<dc:creator>deirdre-mahon</dc:creator>
				<category><![CDATA[Blog Entries]]></category>

		<guid isPermaLink="false">http://rainstor.com/?p=3845</guid>
		<description><![CDATA[by Deirdre Mahon, VP Marketing In information-intensive industries such as Financial Services, Communications, Media, Retail and Utilities, Big Data management and analysis is an absolute must-have strategy this year. Yet CIOs everywhere are trying to figure out what to do about the staggering costs of how and where to store the hundreds of terabytes and [...]]]></description>
				<content:encoded><![CDATA[<p><a href="http://www.marketplace.org/topics/business/big-data-creates-big-industry-storing-data" target="_blank"><img class="alignnone" alt="2013-04-11_16-19-37" src="http://rainstor.com/2013_new/wp-content/uploads/2013/04/2013-04-11_16-19-37.jpg" /></a><br />
by Deirdre Mahon, VP Marketing</p>
<p>In information-intensive industries such as Financial Services, Communications, Media, Retail and Utilities, Big Data management and analysis is an absolute must-have strategy this year. Yet CIOs everywhere are trying to figure out what to do about the staggering costs of how and where to store the hundreds of terabytes and petabytes of data they are collecting and hoping to analyze for business gain.</p>
<p>In a <a href="http://www.marketplace.org/topics/business/big-data-creates-big-industry-storing-data">recent report</a> from NPR, an Aberdeen analyst was quoted as saying that companies are spending 12% of their IT budgets on storage, costs which are doubling every two years. A CTO of a startup also quoted in the story noted that he migrated his company’s data to Amazon to save money, yet 15% of his IT budget still goes toward storage.</p>
<p>That’s a high percentage, considering that storage doesn’t provide any value in itself to a company. It’s the business application users and the data analysts who innovate by leveraging rich data assets and driving new decisions to connect with customers. If Big Data is hampered by storage costs, companies don’t have enough money to put toward the analytics side of the equation, which is what ultimately drives better business results. Most data and analytics experts will tell you that as an industry, we’ve only scratched the surface of mining multi-structured data. Solving the storage issue is critical, so companies can get to the next level with their Big Data initiatives.</p>
<p>Of course, organizations with deep pockets can get around this problem by throwing more hardware into their data centers. Yet how sustainable is this? How large can these data centers grow and is this a responsible thing to do for the balance sheet, much less the environment? As well, there are the hidden costs of storage, such as backups and of course the necessary test and dev environments. The larger your data footprint, the more you’ll pay for replicating it and protecting it.</p>
<p>There is another way: invest in technologies that more efficiently manage and store your data. RainStor has focused on the data compression challenge for this very reason. Using our database, companies are saving 90% on data storage costs. Our patented <a href="http://rainstor.com/products/rainstor-database/compress/">data compression technology</a> often achieves compression ratios five times higher than competing technologies. We are tackling the problem at the root – storing everything once and never repeating anything, which massively shrinks the data storage footprint and in turn dramatically reduces your physical or cloud server needs and data center space.</p>
<p>At the tremendous rate of data growth, there’s really no better answer.  Check out <a href="http://wikibon.org/blog/big-data-statistics/">this resource</a> for some mind-boggling statistics. As an industry, we have to focus on data compression first, and then look to investing in highly efficient storage devices, virtualization and cloud services to be even more efficient and cost effective.</p>
]]></content:encoded>
			<wfw:commentRss>http://rainstor.com/lies-beneath-big-data-costs-storage/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Choosing your Data Store: Generalize, Specialize or Compromise?</title>
		<link>http://rainstor.com/choosing-data-store-generalize-specialize-compromise/</link>
		<comments>http://rainstor.com/choosing-data-store-generalize-specialize-compromise/#comments</comments>
		<pubDate>Tue, 09 Apr 2013 18:07:13 +0000</pubDate>
		<dc:creator>Jyothi Swaroop</dc:creator>
				<category><![CDATA[Blog Entries]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[Data Archive]]></category>
		<category><![CDATA[Data Compression]]></category>
		<category><![CDATA[Database]]></category>
		<category><![CDATA[oracle]]></category>
		<category><![CDATA[Rainstor]]></category>

		<guid isPermaLink="false">http://rainstor.com/?p=3827</guid>
		<description><![CDATA[by Mark Cusack, Chief Architect, RainStor &#160; Way back in 2004, when we were small boys chipping away at the early codeface, we knew we weren’t going to grow up to be like all the other databases out there.  We didn’t want to be a general-purpose database platform, because we knew the bigger boys like [...]]]></description>
				<content:encoded><![CDATA[<p><strong>by Mark Cusack, Chief Architect, RainStor</strong></p>
<figure><a href="http://rainstor.com/2013_new/wp-content/uploads/2013/04/3stooges.png"><img class="aligncenter" alt="3stooges" src="http://rainstor.com/2013_new/wp-content/uploads/2013/04/3stooges-300x224.png" /></a></figure>
<p>&nbsp;</p>
<p>Way back in 2004, when we were small boys chipping away at the early codeface, we knew we weren’t going to grow up to be like all the other databases out there.  We didn’t want to be a general-purpose database platform, because we knew the bigger boys like Oracle would laugh at us and call us names.  So we carved out a niche for ourselves as a database archive.  We designed the system to be cool with handling data at scale, and with keeping it safe.  We’ve learned over the years by making mistakes, and by being taken back to school by some of our earliest prospects and customers, who patiently explained to us precisely what they expect of a system that will store and manage their valuable data for the long haul.</p>
<p>Our database technology runs on many different storage platforms.   We’ve got RainStor production deployments on NAS, SAN, CAS, Cloud and Hadoop.  We like to say that we’re storage-agnostic:  we’ll pretty much deliver the same level of functionality and performance wherever, and on whatever, you want to store your data.   It’s a bold claim, and one that we’re frequently measured against during the proof of concept phase.  We work ‘as advertised’ because we make compromises in order to be exactly the kind of long-term data archiving solution we want to be.</p>
<p>First, we give up the strict ability to perform transactional updates to records.  You can perform bulk append operations on tables in RainStor, but you can’t directly modify data once it’s in one of our tables.  If your data takes the form of closed business transactions or log data, then that’s great!  Come on in, make yourself comfortable and relax the grip on your wallet.  It doesn’t make sense for your data to change, and we want to do everything in our power to prevent that from happening.  After all, in some of the tough regulatory environments we operate in, changing records gets you fired or imprisoned, which can be really inconvenient.</p>
<p>But specializing in immutable data helps us in many ways.  First, when we append records to a table, we gather perfect statistics on the new data, which we can use later down the line when we want to query it.  We can avoid costly table locking operations across a cluster of RainStor servers, as we don’t need to worry about synchronizing access to records.  So this means that data loads and queries can happen at the same time.</p>
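<p>As a rough illustration of why statistics gathered at append time help later queries, the sketch below records a minimum and maximum for every appended block and uses them to skip blocks that cannot contain the value being searched for. The choice of min/max as the statistic, and the names <code>append_block</code> and <code>query_equal</code>, are assumptions made for this example. The post does not say which statistics RainStor keeps.</p>

```python
def append_block(blocks, values):
    """Append an immutable block and record exact statistics for it.

    Because the block never changes after it is written, the statistics
    computed here stay accurate for the life of the block.
    """
    values = list(values)
    blocks.append({"min": min(values), "max": max(values), "values": values})


def query_equal(blocks, target):
    """Return (matching values, number of blocks actually scanned)."""
    hits = []
    scanned = 0
    for block in blocks:
        # Skip any block whose value range cannot contain the target.
        if block["min"] > target or target > block["max"]:
            continue
        scanned += 1
        hits.extend(v for v in block["values"] if v == target)
    return hits, scanned


blocks = []
append_block(blocks, [1, 5, 9])
append_block(blocks, [20, 25, 30])
append_block(blocks, [40, 41, 42])
hits, scanned = query_equal(blocks, 25)
print(hits, scanned)
```

<p>Only one of the three blocks is read for this query. With mutable data the statistics would have to be maintained or re-estimated on every update, which is the cost the append-only model avoids.</p>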
<p>We take this further by asynchronously copying metadata describing newly appended records around the cluster.   This eventually consistent scheme helps us to scale out RainStor efficiently across servers.  It also means it may take a few seconds for newly added records to be available to query, particularly in a heavy mixed workload environment.  But if you’ve just added an extra 1 million records to a table that already contains 1 trillion entries, perhaps representing a year’s worth of stock trade events, then you probably don’t care so much.</p>
<p>We also specialize in storing structured data.  Data with a fixed schema, such as you’d get from a traditional relational database.   We built our database around our own compression technology, which works by de-duplicating single- and compound-field values across a table.  Generally speaking, the larger the field value, the greater the entropy, and the less scope for removing repetition.  Yes, you can store BLOBs in RainStor tables, and some folks do.  But they aren’t doing it for compression purposes.</p>
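<p>The general idea of de-duplicating field values can be sketched with plain dictionary encoding: each distinct value is stored once, and rows hold small integer references to it. This is a textbook technique shown for illustration, not RainStor’s patented method, and <code>dedupe_column</code> and <code>restore_column</code> are hypothetical names. Passing tuples instead of single values gives a simple stand-in for compound-field values.</p>

```python
def dedupe_column(values):
    """Store each distinct value once; rows keep integer references to it."""
    dictionary = []   # unique values, in first-seen order
    index_of = {}     # value -> position in the dictionary
    refs = []         # one small integer per original row
    for value in values:
        if value not in index_of:
            index_of[value] = len(dictionary)
            dictionary.append(value)
        refs.append(index_of[value])
    return dictionary, refs


def restore_column(dictionary, refs):
    """Rebuild the original column from the dictionary and references."""
    return [dictionary[i] for i in refs]


# Single-field values: a ticker symbol column with heavy repetition.
tickers = ["IBM", "AAPL", "IBM", "IBM", "AAPL"]
ticker_dict, ticker_refs = dedupe_column(tickers)

# Compound-field values: (exchange, currency) pairs treated as one unit.
pairs = [("NYSE", "USD"), ("LSE", "GBP"), ("NYSE", "USD")]
pair_dict, pair_refs = dedupe_column(pairs)

print(ticker_dict, ticker_refs)
print(pair_dict, pair_refs)
```

<p>The sketch also shows why large, high-entropy values such as BLOBs gain little: if nearly every value is distinct, the dictionary is as large as the original column.</p>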
<p>We are really, really, really, really good at compressing structured data.  I’ll save the details for another post, but we’ve never been left worse off in a fight because of our compression numbers.  Big talk for sure, but the origin of the swagger stems partly from the compromises we make around immutability.  If your data isn’t going to change, then you can be smarter about how you order it when you persist it to disk.  We have heuristics that kick in during data appends, that optimize for compression in an adaptive way.  If you place values in columns that are similar, either lexicographically or by some other appropriate metric, then the scope for compression increases.  Pre- and post-fix string encoding become helpful, as do value differencing and minimal bit- and byte-representations.</p>
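<p>Two of the techniques named above, prefix encoding and value differencing, are easy to show on data that has already been ordered. The sketch below uses the standard textbook forms (front coding for strings, delta encoding for integers). The function names are hypothetical, and nothing here should be read as RainStor’s actual encoding.</p>

```python
import os


def delta_encode(sorted_ints):
    """Value differencing: store each number as the gap from the previous one."""
    previous = 0
    deltas = []
    for n in sorted_ints:
        deltas.append(n - previous)
        previous = n
    return deltas


def delta_decode(deltas):
    """Invert delta_encode by keeping a running total."""
    total = 0
    values = []
    for d in deltas:
        total += d
        values.append(total)
    return values


def front_encode(sorted_strings):
    """Prefix encoding: (length shared with previous string, remaining suffix)."""
    previous = ""
    encoded = []
    for s in sorted_strings:
        shared = len(os.path.commonprefix([previous, s]))
        encoded.append((shared, s[shared:]))
        previous = s
    return encoded


def front_decode(encoded):
    """Invert front_encode by reusing the previous string's prefix."""
    previous = ""
    strings = []
    for shared, suffix in encoded:
        s = previous[:shared] + suffix
        strings.append(s)
        previous = s
    return strings


timestamps = [1000, 1003, 1004, 1010]
trade_ids = ["trade_001", "trade_002", "trade_010"]
print(delta_encode(timestamps))
print(front_encode(trade_ids))
```

<p>Both encodings only pay off when neighbouring values are similar, which is why ordering the data before it is persisted matters so much.</p>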
<p>When we persist blocks of newly appended compressed records as a large file, we are also careful about how we lay this file out on disk.  Remember that bold claim I made about working well on any storage platform?  Well, we try to be sequential read-friendly, and most storage technologies will excel at sequential reads.  So we are as happy on cheap 7.2K RPM near-line SAS drives as we are on the tier 1 stuff.  We don’t support updates, so we don’t have to worry about random reads and writes.  Nice.</p>
<p>Finally, I want to talk about the compromises we’ve made in our query system.  We’re designed for read-oriented query workloads, so we’ve focused on supporting the read-only features of the SQL92 standard.  No INSERTs and no UPDATEs.  We do, however, support record-level delete, which we implement in a logical way so as not to break our strict immutability model.</p>
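<p>A logical delete over immutable data can be sketched as a separate set of “tombstones” that readers consult, leaving the stored rows untouched. The <code>AppendOnlyTable</code> class below is a hypothetical illustration of that general pattern, not RainStor’s design.</p>

```python
class AppendOnlyTable:
    """Toy table: rows are only ever appended, deletes are recorded separately."""

    def __init__(self):
        self._rows = []        # never modified in place once appended
        self._deleted = set()  # positions of logically deleted rows

    def append(self, rows):
        """Bulk-append rows and return the positions they were given."""
        start = len(self._rows)
        self._rows.extend(rows)
        return list(range(start, len(self._rows)))

    def delete(self, row_id):
        """Mark a row as deleted without touching the stored data."""
        if row_id not in range(len(self._rows)):
            raise IndexError(f"no such row: {row_id}")
        self._deleted.add(row_id)

    def scan(self):
        """Return the rows visible to queries, skipping deleted positions."""
        return [row for i, row in enumerate(self._rows) if i not in self._deleted]

    def stored_row_count(self):
        """Number of rows physically held, including logically deleted ones."""
        return len(self._rows)


table = AppendOnlyTable()
ids = table.append([("t1", 100), ("t2", 250), ("t3", 75)])
table.delete(ids[1])
print(table.scan())
print(table.stored_row_count())
```

<p>Queries no longer see the deleted row, yet the underlying storage is unchanged, so the immutability that the rest of the design relies on is preserved.</p>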
<p>We built our query system from scratch.  There’s no PostgreSQL or MySQL heritage here, because we wanted our query engine to be scalable and index-free, and to fit in with our large block approach, our immutable data model, and the data structures that underpin our compression technology.   Our query engine takes advantage of the way we pre-order data in columns, and only store unique values.  Both of these features lead to faster projection, selection, sort, partition and join operations in RainStor.</p>
<p>Watch this space for more of the detail.  For now, I’ll leave you by saying:  yes, generalize, specialize or compromise if you like.  Just be the best damn database you can be.</p>
<p>At RainStor, we like to think we’re a bit special.</p>
]]></content:encoded>
			<wfw:commentRss>http://rainstor.com/choosing-data-store-generalize-specialize-compromise/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Hadoop’s Growing Pains</title>
		<link>http://rainstor.com/hadoops-growing-pains/</link>
		<comments>http://rainstor.com/hadoops-growing-pains/#comments</comments>
		<pubDate>Wed, 27 Mar 2013 03:25:13 +0000</pubDate>
		<dc:creator>rainstor2013</dc:creator>
				<category><![CDATA[Blog Entries]]></category>

		<guid isPermaLink="false">http://rainstor.com/?p=3735</guid>
		<description><![CDATA[by Deirdre Mahon, VP Marketing, RainStor If you want to learn what data scientists and developers are thinking about Big Data, there’s probably no better venue than at the O’Reilly Strata conference, where the big thinkers tend to congregate. The results of a survey conducted by Dimensional Research and sponsored by RainStor are enlightening, showing [...]]]></description>
				<content:encoded><![CDATA[<p><b>by Deirdre Mahon, VP Marketing, RainStor<br />
</b></p>
<p>If you want to learn what data scientists and developers are thinking about Big Data, there’s probably no better venue than at the <a href="http://strataconf.com/strata2013/public/content/home">O’Reilly Strata</a> conference, where the big thinkers tend to congregate. The results of <a href="http://rainstor.com/big-data-survey-hadoop-projects-planning-pilot-stages/" target="_blank">a survey</a> conducted by <a href="http://www.dimensionalresearch.com/">Dimensional Research</a> and sponsored by <a href="http://rainstor.com/">RainStor</a> are enlightening, showing just how challenging working with Hadoop can be, even for experts in the field.</p>
<p>Roughly 88% of people surveyed at the conference say they are experiencing challenges with Hadoop. The top reasons include a lack of real-time response and the need for manual coding, which results in a longer time to production. “People are excited about Hadoop but even the smartest people in this field are still figuring it out,” says Diane Hagglund, a senior research analyst with Dimensional Research.  <a href="http://www.datanami.com/datanami/2012-07-06/ford_looks_to_hadoop_innovative_analytics.html">Read about how Ford is experimenting with Hadoop.</a> The extensive training needed to roll out Hadoop adds to the cost of deployment, and finding and hiring people with the right skills poses an even greater challenge. Companies are not yet seeing rapid results from their Big Data projects and, more importantly, some still struggle to define exactly what the business wants and which problems they are trying to solve. Business users need to focus on a clear definition of the problem up front, which will drive the technology requirements, as opposed to just investing in a Hadoop cluster and throwing some technical resources at it to see what’s possible.  Attending the <a href="http://event.gigaom.com/structuredata/">GigaOM Structure:Data</a> conference in New York this past week, we heard very similar comments.  The majority of attendees are definitely working on Hadoop, but a true ROI from a business standpoint has yet to be realized, much less a full understanding of how the Hadoop environment will co-exist alongside other enterprise data management and analytics platforms.  It often feels as though data scientists need to figure out what’s possible with Hadoop-based technologies before fully involving the business user and investing more heavily.</p>
<p>When it comes to data volumes, however, most companies are not talking about petabyte-scale – at least not yet. Fewer than half of Hadoop projects in production manage more than 500 TB of data, according to the survey. “This confirms that many are still in the pilot stage, and at that point, it’s to be expected that you would manage a lower data volume,” says Hagglund.</p>
<p>On a positive note, the fact that 24% of respondents reported having a project with Hadoop in production is a very good sign in the still nascent “market” of Hadoop-based environments, Hagglund says. Companies are choosing Hadoop because of the belief that open source is much more affordable. More than half of participants say that the primary reason for using Hadoop is low-cost scale, followed by better analysis of broader data sets and unique functionality.</p>
<p>Hadoop is still a young technology &#8211; it’s clear that many organizations need more resources, expertise, solutions and tools to ease the implementation challenges. Each week we see new market entrants, which are speeding up the rate of Hadoop adoption. In fact, different verticals are adding their own unique sets of tools that satisfy requirements such as built-in security and regulatory compliance capabilities. I believe the phase of Hadoop experimentation is drawing to a close. We are now entering a stage of rapid adoption, even slightly beyond the early adopter phase, because companies are creating best practices and seeking standardization and ease of use so that users can gain insights at a faster pace.</p>
<p><strong>Related Articles:</strong></p>
<p>WSJ-CIO Journal<br />
<b><a href="http://blogs.wsj.com/cio/2013/03/22/hadoop-excitement-hits-11-implementation-still-low/">Hadoop Excitement Hits 11, Implementation Still Low</a></b></p>
<p>Big Data Storage and Management Report<br />
Pete Goldin, DSM Report<br />
<a href="http://datastoragereport.com/big-data-survey-says-half-of-hadoop-projects-are-still-in-planning-or-pilot-stages"><b>Big Data Survey Says Half of Hadoop Projects Are Still in Planning or Pilot Stages</b></a></p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://rainstor.com/hadoops-growing-pains/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
