programming4us
programming4us
DATABASE

SQL Server 2012 : Native File Streaming - Traditional BLOB Strategies, Introducing FILESTREAM

- How To Install Windows Server 2012 On VirtualBox
- How To Bypass Torrent Connection Blocking By Your ISP
- How To Install Actual Facebook App On Kindle Fire
1/20/2015 7:25:50 PM

Traditional BLOB Strategies

Prior to SQL Server 2008, there were two traditional solutions for combining structured table data with unstructured BLOB data: either keep BLOBs in the database (with all your structured table data), or store them outside the database (in the file system). In the former case, BLOBs are stored right inside the database. In the latter case, the database merely holds references (or, possibly, references are derived from other values in the database) that point to locations in the file system where the BLOBs actually live. Each of these strategies has pros and cons with respect to storage, manageability, performance, and programming complexity that we’ll discuss—but neither of them are intrinsically native to the core database engine.

BLOBs in the Database

You can, of course, simply store BLOB data directly in the columns of your database tables. You do this by declaring a column as a varbinary(max) data type, which will allow it to store a single BLOB up to 2 gigabytes (GB) in size.

Important

You should no longer use the image data type that was used to store BLOBs prior to Microsoft SQL Server 2005. The varbinary(max) data type should be used instead of image, which has been deprecated and may be removed in a future version of SQL Server.

When BLOB data is stored directly inside of tables this way, it is very tightly integrated with the database. It’s easy to access BLOB data, because the BLOB is right there in the table’s column. Because everything is contained in a single database unit, management is also simplified. Backup, restore, detach, copy, and attach operations on the database include structured and BLOB data together. Transactional consistency is another important benefit that you enjoy with this approach. Because BLOB data is a physical part of the tables in the database, it is eligible to participate in transactions. If you begin a transaction, update some data, and then roll back the transaction, any BLOB data that was updated is also rolled back. Overall, the mixture of structured and BLOB data is handled very seamlessly with this model.

Despite all these advantages, however, physically storing BLOBs in the database is practical only for small-scale applications having very few/small BLOBs. Because BLOB content is stored in-line with structured data, you can severely impair scalability by bloating your filegroups. Query performance will degrade rapidly as a result, because the query processor must sift through much larger amounts of data in your tables that are consumed with BLOB content. The BLOBs also don’t stream nearly as efficiently when backed by SQL Server varbinary(max) columns as they would if they were held externally in the file system or on a dedicated BLOB store. Finally, there is also a 2 GB limit on the varbinary(max) data type.

Tip

If you have modest BLOB requirements (that is, you are dealing with very few or very small BLOBs) you should store them in the database using the varbinary(max) data type instead of using the file system (either directly, or via FILESTREAM). Furthermore, you should consider caching small, frequently accessed BLOBs rather than repeatedly retrieving them from the database or the file system.

BLOBs in the File System

To counter these concerns, you can store BLOBs outside the database and in the file system instead. With this approach, the structured data in your relational tables merely contains path information to the unstructured BLOB data, which is held externally as ordinary files in the file system (alternatively, path information can be derived from other structured data in a row). Applications use this path information as a reference for locating and tracking the BLOB content associated with rows in the database tables. Because they are physically held in the file system, a BLOB’s size is limited only by the host file system and available disk space. This approach also delivers much better streaming performance, because the file system is a native environment that’s highly optimized for streaming. And because the physical database is much smaller without the BLOBs inside it, the query processor can continue to deliver optimal performance.

Although physically separating structured and unstructured content this way does address the performance concerns of BLOBs, it also raises new issues because the data is now separated not only physically, but logically as well. SQL Server has absolutely no awareness of the association between data in the database and files stored externally in the file system. Their coupling exists solely at the application level. Backup, restore, detach, copy, and attach operations on the database files therefore include only structured table data without any of the BLOB data that’s in the file system. You won’t get complete backups, unless you back up the file system as well, so now it’s another administrative burden to separately manage the file system.

Application development against this model is also more complex because of the extra effort required for linking between the database and the file system. It’s up to the developer to establish and maintain the references between structured data and external BLOBs on their own, and according to their own custom design. Last, although perhaps most significant, there is no unified transactional control across both the database and the file system. Naturally, rolling back a database transaction won’t undo changes you’ve made in the file system.

Introducing FILESTREAM

Both of the traditional BLOB solutions present tough challenges, so what do you do? With FILESTREAM, SQL Server offers a way out of this conundrum. First, make sure you understand that FILESTREAM is technically not a SQL Server data type. Rather, it is implemented as an attribute that you apply to the varbinary(max) data type—the same data type traditionally used to store BLOBs directly inside structured tables. However, merely applying this attribute tells SQL Server to store the BLOB in the file system rather than the table’s structured filegroup. With the FILESTREAM attribute applied, you continue to treat the varbinary(max) column as though its contents were stored in-line with your structured table data. Under the covers, however, SQL Server stores and maintains the data in the server’s local NTFS file system, separately from the structured content of the database that remains in the normal filegroups.

With FILESTREAM, structured and unstructured data are logically connected while physically separated. The unstructured data is configured as a special filegroup in the database, so it’s actually considered part of the database—it is available in all logical database operations, including queries, transactions, and backup/restore. On disk, however, the BLOBs are stored as individual physical files in the NTFS file system that are created and managed automatically behind the scenes. SQL Server establishes and maintains the link references between the structured file groups and the file system. It knows about the unstructured BLOB data in the file system and considers the files holding BLOB data to be an integral part of the overall database. But the unstructured data does not impede query performance because it is not physically stored in-line with table data. It’s stored in the file system, which is a native BLOB environment (and where it ostensibly belongs). Logically, however, the database encompasses both the relational tables and the individual BLOB files in the file system. You therefore continue to treat BLOB data as though you were storing it in-line, from both a development and an administrative perspective.

Tip

Backing up the database includes all the BLOB data from the file system in the backup automatically. However, because the BLOB data is contained in its own database filegroup, you can easily exclude it from backups if desired or as needed.

The end result is that SQL Server uses the appropriate mechanism for structured and unstructured data—storing relational (structured) data in tables and BLOB (unstructured) data in ordinary files—so it can deliver the best possible performance all around. Because it does this completely transparently, you enjoy integrated management benefits over the database. You also enjoy simplified application development as you are no longer burdened with the additional complexities of manually associating the database with the file system and keeping the two in sync. Last, by leveraging the transactional capabilities of the NTFS file system, BLOB updates participate seamlessly with database transactions. If you’re starting to get excited by all this, that’s the idea! You’re now ready to dive in to some real code that puts FILESTREAM to work for you.

Other  
  •  SQL Server 2012 : Consolidating Data Capture with SQLdiag - Getting Friendly with SQLdiag (part 2) - Using SQLdiag as a Service
  •  SQL Server 2012 : Consolidating Data Capture with SQLdiag - Getting Friendly with SQLdiag (part 1) - Using SQLdiag as a Command-line Application
  •  SQL Server 2012 : Consolidating Data Capture with SQLdiag - The Data Collection Dilemma, An Approach to Data Collection
  •  SQL Server 2012 : Troubleshooting Methodology and Practices - Data Analysis, Validating and Implementing Resolution
  •  SQL Server 2012 : Troubleshooting Methodology and Practices - Data Collection
  •  SQL Server 2012 : Troubleshooting Methodology and Practices - Defining the Problem
  •  SQL Server 2012 : Troubleshooting Methodology and Practices - Approaching Problems
  •  Two Is Better Than One - WD My Cloud Mirror
  •  Data In A Pinch - Kingston Mobile Lite Wireless G2
  •  SQL Server 2012 : Policy Based Management - Evaluating Policies
  •  
    Top 10
    - Microsoft Visio 2013 : Adding Structure to Your Diagrams - Finding containers and lists in Visio (part 2) - Wireframes,Legends
    - Microsoft Visio 2013 : Adding Structure to Your Diagrams - Finding containers and lists in Visio (part 1) - Swimlanes
    - Microsoft Visio 2013 : Adding Structure to Your Diagrams - Formatting and sizing lists
    - Microsoft Visio 2013 : Adding Structure to Your Diagrams - Adding shapes to lists
    - Microsoft Visio 2013 : Adding Structure to Your Diagrams - Sizing containers
    - Microsoft Access 2010 : Control Properties and Why to Use Them (part 3) - The Other Properties of a Control
    - Microsoft Access 2010 : Control Properties and Why to Use Them (part 2) - The Data Properties of a Control
    - Microsoft Access 2010 : Control Properties and Why to Use Them (part 1) - The Format Properties of a Control
    - Microsoft Access 2010 : Form Properties and Why Should You Use Them - Working with the Properties Window
    - Microsoft Visio 2013 : Using the Organization Chart Wizard with new data
    REVIEW
    - First look: Apple Watch

    - 3 Tips for Maintaining Your Cell Phone Battery (part 1)

    - 3 Tips for Maintaining Your Cell Phone Battery (part 2)
    programming4us programming4us
    programming4us
     
     
    programming4us