Big Data & Storage (Part 2)

12/13/2012 3:10:24 PM

Dealing in data

Using responses from dozens of IT personnel, Aberdeen Group identifies three storage tools to help manage data growth, with each addressing a different challenge associated with storing big data. These include storage virtualization (manages various data types), data compression or deduplication (manages the size of data), and data tiering (manages the speed at which data changes). "Using these three storage features, organizations can reduce the financial impact of big data management," according to Aberdeen's report.

Of the three, Csaplar says storage virtualization is probably the least interesting to SMEs because they likely have only one storage device or multiple devices from the same vendor. Data deduplication and storage tiering, however, "are capabilities available today from multiple vendors that can be set and forgotten," he says.
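To make the size-reduction idea concrete, here is a minimal Python sketch of how deduplication and compression can work together. It is an illustration only, not how any vendor's appliance is implemented: the fixed 4,096-byte block size, the function names, and the sample data are all assumptions made for the example.

```python
import hashlib
import zlib


def dedup_and_compress(data: bytes, block_size: int = 4096) -> dict:
    """Split data into fixed-size blocks, keep one compressed copy of each unique block."""
    store = {}   # block hash -> compressed block (the only copy kept)
    recipe = []  # ordered list of hashes needed to rebuild the original data
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:
            # First time this block is seen: compress it and keep it.
            store[digest] = zlib.compress(block)
        recipe.append(digest)
    return {"store": store, "recipe": recipe}


def restore(result: dict) -> bytes:
    """Rebuild the original data from the unique blocks and the recipe."""
    return b"".join(zlib.decompress(result["store"][d]) for d in result["recipe"])


def stored_bytes(result: dict) -> int:
    """Total size of the compressed unique blocks actually kept."""
    return sum(len(chunk) for chunk in result["store"].values())


# Hypothetical sample: two distinct blocks repeated 50 times each.
sample = (b"A" * 4096 + b"B" * 4096) * 50
result = dedup_and_compress(sample)
print("original bytes:", len(sample))
print("unique blocks kept:", len(result["store"]))
print("bytes stored after dedup + compression:", stored_bytes(result))
```

Real products typically use more sophisticated techniques (variable-size chunking, for instance), but the principle is the same: identical data is stored once and referenced many times.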

In its "The Economic Benefit of Storage Efficiency Technologies" study, IDC reported that storage end users that currently deploy or plan to deploy storage and/or data efficiency technologies in data centers typically adopt technologies in stages, beginning with compression and deduplication, then storage virtualization, tiering, thin provisioning, and replication. IDC identified data compression and storage virtualization as the two most adopted storage efficiency technologies and data deduplication as the most desirable for near-future implementation.

In preparing for and managing big data-level storage requirements, Quinn says existing tools SMEs use to predict storage requirements "typically suffice for big data applications." For companies acquiring a big data appliance, "the levels are largely predetermined," though he advises ensuring "that storage and CPU upgrade paths are clear and pricing transparent; expect that the upgrades will be required." ESG, Quinn says, believes tiered storage and thin provisioning will prove key to ongoing big data operations.

Robosson says organizations of all sizes are assessing methods and tools to improve real-time delivery of big data analytics. "Traditional SANs may in some cases be insufficient, giving rise to SSD [solid-state drive] and DAS [direct-attached storage] as higher-performance options purely from a device perspective," he says. Once SMEs determine a big data strategy, he says, new data storage appliances offer good options.

Where backup and DR (disaster recovery) requirements related to big data are concerned, Robosson says, "as the data model expands to include nontraditional data types and sources, so will disaster recovery protocols undergo scope changes." Companies will need to determine how critical new data types are to their operations, he says. "In other words, there may be cases where conventional DR rules might not apply to non-transactional data (images, video, social media, spatial, etc.) vs. relational data that businesses rely on for accounting, customer service, and routine operations," he says.

Is Virtualization Necessary for SMEs?

For initial big data projects that rate as "experimental" or "first pass discovery," Quinn says disaster recovery and data protection requirements may lag behind what companies currently require at an existing operational data store/data warehouse. As big data becomes essential to critical decision-making, however, the "big data facility and/or apps will rank among the top tier of apps, like transactional applications," he says. "That implies that eventually DR and data protection will be extremely important. Our surveys . . . suggest that after availability, data archiving and data migration are the next most important infrastructural functions in support of big data."

Woo explains that some companies will view data related to big data as derivative, meaning if it's lost, it can be regenerated. Others will view it as archival, "which means that backups should have already been taken of the data," Woo says. "More progressive and aggressive companies may need to update backup procedures to accommodate an increased amount of data retention, and therefore upgrade backup infrastructure. Tape is actually very good for this purpose."

Where cloud computing and the storage of big data-scale volumes are concerned, Csaplar sees the cloud as an ultimate storage tier rather than a primary storage site. "Clouds are remote, and therefore latency is introduced if you try and use cloud for just storage. I would think of clouds for archiving, backup, and recovery or long term storage options," he says. Woo says much data now used for big data analysis is actually cloud-based data that includes social media posts. "In these cases, since the data is already in the cloud, it's best left in the cloud," he says.
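The idea of treating the cloud as the last storage tier can be sketched as a simple age-based placement rule. The Python example below is a hypothetical illustration, not a policy recommended by any of the analysts quoted here: the tier names, the day thresholds, and the `choose_tier` function are all assumptions chosen for the example.

```python
from datetime import date

# Hypothetical thresholds: maximum days since last access for each tier.
# Faster, costlier tiers come first; anything older falls through to the cloud.
TIERS = [
    (30, "ssd"),
    (180, "disk"),
    (730, "tape"),
]


def choose_tier(last_access: date, today: date) -> str:
    """Pick a storage tier based on how recently the data was accessed."""
    age_days = (today - last_access).days
    for max_age, tier in TIERS:
        if age_days <= max_age:
            return tier
    # Oldest data goes to the final tier, where latency matters least.
    return "cloud-archive"


today = date(2012, 12, 13)
for last_access in (date(2012, 12, 1), date(2012, 9, 1), date(2009, 1, 1)):
    print(last_access, "->", choose_tier(last_access, today))
```

In practice, a tiering policy would also weigh factors such as data criticality and retrieval cost, but access recency is a common starting point.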

Many vendor storage appliances are easily deployable in a private cloud, Robosson says. SMEs must evaluate the investment and associated benefits of cloud-based access, he says. Public cloud providers are also a viable option for many SMEs that can't incur the upfront investment, he says. Quinn, meanwhile, says companies can use the cloud "as the overflow or temporary data bucket for peaks, projects, and special but temporary or less important information-management needs associated with big data."

How to proceed

The approach SMEs should take to prepare for a big data initiative can vary. Robosson advises companies thoroughly assess their data model, data types, volumes, and potential business intelligence when developing a big data strategy and associated storage game plan. The SME should then complete a proof of concept to validate a vendor solution it's considering before proceeding with implementation.

Due to the complexity and cost involved, Quinn says SMEs should consider smaller platform options for big data. He suggests choosing one in which vendors offer "cloud and/or prepackaged (with partners) infrastructures and a decent array of integration tools, analytics functions, and rich visualization." He adds that the larger big data suppliers are often "designed to better serve the larger data volumes and broader complexity of requirements of the Global 2000."

On a positive note, Quinn says most best practices that businesses have already learned apply to big data. "Understanding the importance and periodicity of the data flows for big data are essential, and remember the data flows aren't just for data ingest but also for feeding data potentially to visualization tools. You may have new integration flows as well, and those may require fresh storage and information management." After a business attunes itself to data, Quinn says, it can use tiered storage approaches.

Cost-wise, Woo believes costs should be secondary to the business purpose for which the company is deploying big data. That said, Woo says, costs may not generally be that high, as companies can use industry-standard servers along with open source software. "The real cost really comes in people," he says.

Robosson suggests that SMEs budget for an initial analysis to help map out an overall game plan and plan for IT training, storage appliance acquisition, and implementation services. Quinn, meanwhile, advises companies and IT managers to work jointly to create a complete cost estimate for big data, "not just storage, not just infrastructure, but software, training, additional personnel, etc." Overall, he says, expect costs similar to those of existing data warehousing and business intelligence solutions, "but most importantly remember that you're starting new, meaning you will have capital outlays and/or increases in subscription outlays."

Key points

Data is generated at unprecedented levels, making processing and analyzing information difficult using traditional databases and software.

Many big data vendors are including infrastructure and software in their appliances to alleviate pressure and guesswork facing SMEs considering big data initiatives.

When developing a big data strategy, an SME should thoroughly assess its data model, data types, volumes, and potential business intelligence.

Storage end users that currently deploy or plan to deploy storage and/or data efficiency technologies in data centers typically adopt technologies in stages.
