Detect Image File Types Through Byte Arrays

March 07, 2017

Have you ever needed to know the file type of an image?  Many image formats begin with the same sequence of bytes in every file, often called a file signature or magic number.  Because a file extension can be missing or wrong, checking those leading bytes against known signatures is more reliable than looking for the extension in the file name.

I was working on a project where I needed to download images from a URL and store them on a file system.  The downloaded images would change daily and could be any number of file types.  In fact, I didn’t even know if the file I was downloading was actually going to be an image.  I only needed images, but it was possible that the web call could send back something else, like a web page or a different file type altogether.  I only wanted to save the file if it was an image and if its type was one of the common ones that I expected (.png, .jpeg, .bmp, etc.).  After looking online for a way to detect an image file type, I came across a very informative post on Stack Overflow where someone else was trying to do the same thing.  The answer that made the most sense to me pointed out that a lot of image types have a specific set of bytes at the beginning of the file that denotes which type of file it is.

For example, in C#:

// Requires: using System.Text;
var bmp = Encoding.ASCII.GetBytes("BM"); // BMP
var gif = Encoding.ASCII.GetBytes("GIF"); // GIF
var png = new byte[] { 137, 80, 78, 71 }; // PNG (first 4 bytes of the signature)
var tiff = new byte[] { 73, 73, 42 }; // TIFF, little-endian ("II")
var tiff2 = new byte[] { 77, 77, 42 }; // TIFF, big-endian ("MM")
var jpeg = new byte[] { 255, 216, 255, 224 }; // JPEG (JFIF)
var jpeg2 = new byte[] { 255, 216, 255, 225 }; // JPEG (Exif, e.g. Canon cameras)

The post covered pretty much all of the image file types that I wanted to handle.  Anything JPEG, PNG, GIF, or BMP was exactly what I was expecting to see.  This allowed me to read the first few bytes of the stream from the URL, compare them to each of these byte arrays, and detect the file type.  It also let me save the files locally with the correct extension.  In the post, KevanTTT’s answer was based on another post’s answer, but KevanTTT modified the solution to use a stream rather than only byte arrays.  It isn’t a huge change, but I thought I should credit both posts.  Once you know that the initial bytes are the same for every file of a given type, it becomes easy to work with any number of image file types.
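
To make the comparison concrete, here is a minimal sketch of how the signatures listed earlier might be checked against a byte array or a stream.  It is not the code from the Stack Overflow answers; the class name ImageSniffer, the method name DetectExtension, and the choice of returned extensions are all assumptions made for illustration.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

// Hypothetical helper, not taken from the referenced posts.
public static class ImageSniffer
{
    // The signatures from the list above, each paired with an extension to save under.
    private static readonly (byte[] Signature, string Extension)[] Known =
    {
        (Encoding.ASCII.GetBytes("BM"), ".bmp"),
        (Encoding.ASCII.GetBytes("GIF"), ".gif"),
        (new byte[] { 137, 80, 78, 71 }, ".png"),
        (new byte[] { 73, 73, 42 }, ".tiff"),
        (new byte[] { 77, 77, 42 }, ".tiff"),
        (new byte[] { 255, 216, 255, 224 }, ".jpeg"),
        (new byte[] { 255, 216, 255, 225 }, ".jpeg"),
    };

    // Returns the extension for the first signature that the header starts with,
    // or null when nothing matches (for example, a web page).
    public static string DetectExtension(byte[] header)
    {
        foreach (var (signature, extension) in Known)
        {
            if (header.Length >= signature.Length &&
                header.Take(signature.Length).SequenceEqual(signature))
            {
                return extension;
            }
        }
        return null;
    }

    // Reads up to 4 bytes (the longest signature here) from the stream and checks them.
    // Stream.Read may return fewer bytes than requested, so it is called in a loop.
    public static string DetectExtension(Stream stream)
    {
        var buffer = new byte[4];
        int read = 0;
        while (read < buffer.Length)
        {
            int n = stream.Read(buffer, read, buffer.Length - read);
            if (n == 0) break; // end of stream
            read += n;
        }
        return DetectExtension(buffer.Take(read).ToArray());
    }
}
```

For example, ImageSniffer.DetectExtension(new byte[] { 137, 80, 78, 71, 13, 10, 26, 10 }) returns ".png", while the bytes of an HTML page return null.  Two caveats with this sketch: the stream overload consumes the bytes it reads, so the caller has to rewind the stream or keep those bytes before saving the file, and a short signature such as "BM" can also match non-image content that happens to start with the same characters.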

This isn’t life-changing information, but it makes sense: every file type has to be identified in some way or another, and probably by more than just an extension at the end of a file name.  I’m happy that I came across it and I hope it benefits you in the future!

If you enjoy this topic or enjoy talking about development of any kind you should check out our available positions at Sparkhound.  Sparkhound is full of people with aligned interests and motivation to provide the best possible solution to any scenario.  There are plenty of great minds to lean on and we like to have fun too!  Feel free to contact me at sam.north@sparkhound.com and/or contact Sparkhound for any further discussions, questions, or feedback.  Woot!

Sam
