
Steganography - Find the hidden data again


Napoleon's Avatar
Member

Hello, I wrote this little program that can hide files in images. Currently the first few bits are used to store the size of the hidden file in bits. After that, the LSBs of the RGB values are overwritten with a 0 or a 1.

Aren't there any better solutions? I mean, I read something about spreading the data over the whole image so we don't need to store how many bits the hidden file is.

http://www.garykessler.net/library/steganography.html Here, for example, they hide 9 bits by overwriting only 4, and nothing is written to say that we changed 4 bits… How does this work when reverting it back to a file/string/etc.?
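A small sketch of why no record of the changed bits is needed: the reader takes the LSB of every byte in order, whether or not it was flipped. The nine cover bytes are the ones quoted in this thread; the 9-bit payload is an assumption read off the LSBs of the stego'd bytes quoted here, and the function names are made up for illustration.

```python
# Nine cover bytes from the example discussed in this thread.
cover = [0b10010101, 0b00001101, 0b11001001,
         0b10010110, 0b00001111, 0b11001010,
         0b10011111, 0b00010000, 0b11001011]

# Assumed payload: the LSBs of the stego'd bytes quoted in the thread.
payload = [1, 0, 1, 1, 0, 1, 1, 0, 1]


def embed_lsb(cover_bytes, bits):
    """Overwrite the LSB of each cover byte with one payload bit."""
    return [(b & 0xFE) | bit for b, bit in zip(cover_bytes, bits)]


def extract_lsb(stego_bytes, count):
    """Read back the LSB of the first `count` bytes, changed or not."""
    return [b & 1 for b in stego_bytes[:count]]


stego = embed_lsb(cover, payload)

# Only the bytes whose LSB differed from the payload bit actually change.
changed = sum(1 for a, b in zip(cover, stego) if a != b)
print(changed)                           # 4
print(extract_lsb(stego, 9) == payload)  # True
```

About half of the LSBs already hold the right value by chance, which is why only 4 of the 9 bytes end up modified. The reader still has to know that 9 bits were hidden, which is the length problem this thread is about.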

Thanks.


ghost's Avatar

If I'm not mistaken (not really big on crypto/encryption), you don't have to worry about file size at all. All you need is the original "pre-stego'd" image and the stego'd image. XOR them, and whichever bits are 1s represent changed bits.

10010101 00001101 11001001 10010110 00001111 11001010 10011111 00010000 11001011

XOR

10010101 00001100 11001001 10010111 00001110 11001011 10011111 00010000 11001011

Yields

00000000 00000001 00000000 00000001 00000001 00000001 00000000 00000000 00000000

We know which bits have been changed and which ones were left untouched. Also, since there are only two possible values for each bit, knowing which bits were changed tells us the original values of those bits. I don't quite understand what you mean, though, when you ask how we convert to strings.
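The XOR above, as a minimal Python sketch. The byte values are the ones from this post; `changed_mask` is just an illustrative name.

```python
original = bytes([0b10010101, 0b00001101, 0b11001001,
                  0b10010110, 0b00001111, 0b11001010,
                  0b10011111, 0b00010000, 0b11001011])
stego = bytes([0b10010101, 0b00001100, 0b11001001,
               0b10010111, 0b00001110, 0b11001011,
               0b10011111, 0b00010000, 0b11001011])


def changed_mask(a, b):
    """XOR two equally long byte strings; 1 bits mark positions that differ."""
    if len(a) != len(b):
        raise ValueError("images must be the same size")
    return bytes(x ^ y for x, y in zip(a, b))


mask = changed_mask(original, stego)
print(" ".join(f"{m:08b}" for m in mask))

# Flipping the marked bits in the stego'd bytes restores the original bytes.
restored = bytes(s ^ m for s, m in zip(stego, mask))
print(restored == original)  # True
```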


Napoleon's Avatar
Member

@ Moshbat: I don't have the knowledge (yet) to reverse the main page's banner. I suppose it holds a hidden message or file using steganography.

@ Pwnzall: Currently I reserve the first sizeof(Int32)x8 bits in the RGB LSBs to store the size of the hidden message/file. The RGB LSBs after that contain the actual bits of the hidden message/file.

Now what I mean with "convert to strings" is: after the image with the hidden message/file is written to disk, I want to retrieve that message/file again. That's what I meant with "convert to strings".

With my current method I first read the first bits (sizeof(Int32)x8 in total) and store that value in a variable called hiddenDataSizeInBits. Then I process the next RGB LSB values in a for loop that runs from 0 to hiddenDataSizeInBits-1. Then I output the hidden message/file. It works, but it has no compression and it uses a sort of custom header to determine where the end of the hidden data is. I'm sorry, I'm probably very unclear, but I don't know how to explain it properly. But this way I don't need the original image as a key to retrieve the data.
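Roughly, the header approach looks like the sketch below. This is not my actual program: it assumes the RGB values are available as one flat byte string (`channels`), stores the size as an unsigned big-endian 32-bit value, and uses made-up function names.

```python
import struct

HEADER_BITS = 32  # sizeof(Int32) x 8


def bytes_to_bits(data):
    """Expand bytes into a list of bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def bits_to_bytes(bits):
    """Pack a list of bits (length divisible by 8) back into bytes."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


def hide(channels, message):
    """Write a 32-bit size header plus the message into the LSBs."""
    payload = bytes_to_bits(message)
    header = bytes_to_bits(struct.pack(">I", len(payload)))
    bits = header + payload
    if len(bits) > len(channels):
        raise ValueError("carrier too small for header plus message")
    stego = bytearray(channels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit
    return bytes(stego)


def reveal(channels):
    """Read the size header, then that many payload bits."""
    header = [b & 1 for b in channels[:HEADER_BITS]]
    hidden_data_size_in_bits = struct.unpack(">I", bits_to_bytes(header))[0]
    end = HEADER_BITS + hidden_data_size_in_bits
    return bits_to_bytes([b & 1 for b in channels[HEADER_BITS:end]])


carrier = bytes(range(256))         # stand-in for real RGB values
print(reveal(hide(carrier, b"hi")))  # b'hi'
```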

That XOR method seems good. I'm gonna implement that one too I think. But still there must be something like spreading the data over the image or compressing it somehow.

I read somewhere about IEEE steganography and ACM and other (very complicated) variants. Isn't there a list here somewhere on HBH with some good links, references or documents with steganography variants? They are so hard to find on Google and I could not find a search button here on the HBH forums so I use google to search the forums here -.-.

[Edit] P.S.

After trying that XOR-method I encountered the following problem:

[hidden message] 111

[original image] 10010101 00001101 11001001 10010110 00001111 11001010 10011111 00010000 11001011

[new stegano-ed image] 10010101 00001101 11001001 10010110 00001111 11001010 10011111 00010000 11001011

yields 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

How do I know the length of the hidden image now?

I came up with the following (dirty?) solution: [hidden message] 111 (+1, we write one extra bit that has the opposite value of the original image's next LSB )

[original image] 10010101 00001101 11001001 10010110 00001111 11001010 10011111 00010000 11001011

[new stegano-ed image] 10010101 00001101 11001001 10010111 00001111 11001010 10011111 00010000 11001011

yields 00000000 00000000 00000000 00000001 00000000 00000000 00000000 00000000 00000000

Now I can simply drop all the 00000000 bytes at the end plus the 00000001 byte, and then I've got my encoded message/file. But isn't there a better/faster solution?
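In code, the extra-bit idea comes down to something like this sketch (function names invented for illustration, one payload bit per carrier byte assumed). The last changed byte in the XOR result is always the marker, because nothing after it is touched.

```python
def embed_with_marker(cover, bits):
    """Write `bits` into the LSBs, then flip the next LSB as an end marker."""
    if len(bits) + 1 > len(cover):
        raise ValueError("carrier too small for message plus marker")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit
    stego[len(bits)] ^= 1  # guaranteed to differ from the original
    return bytes(stego)


def extract_with_marker(original, stego):
    """Find the last changed byte (the marker) and read the LSBs before it."""
    changed = [i for i, (a, b) in enumerate(zip(original, stego)) if a ^ b]
    if not changed:
        raise ValueError("no marker found; images are identical")
    marker = changed[-1]
    return [b & 1 for b in stego[:marker]]


cover = bytes([0b10010101, 0b00001101, 0b11001001,
               0b10010110, 0b00001111, 0b11001010,
               0b10011111, 0b00010000, 0b11001011])
stego = embed_with_marker(cover, [1, 1, 1])
print(extract_with_marker(cover, stego))  # [1, 1, 1]
```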

I swear, stegano problems are much more fun than Sudokus!


ghost's Avatar

My crude hackaround

  1. Include a header, though instead of changing the LSB, forcibly NOT a bit toward the front or middle of the beginning byte
  2. Include the payload
  3. Include a footer, maybe NOT'ing the bits in the middle if you changed them in the front, or vice-versa
  4. Now, when you XOR the stego'd vs the original, you are guaranteed not to end up with the dilemma you had prior. You will always have a reference point
  5. Make sure the #bits are divisible by 8, and pad if necessary (although I think the comp will pad it for you, there is the chance it will cause something like a frame shift which could be disastrous)
  6. Discard the header/footer which just leaves you with raw data
  7. Write the raw data (in binary mode!!!!!) to a file with the extension you want (e.g., in Python: data = extract('stego.jpg', 'original.jpg'); f = open('lol.doc', 'wb'); f.write(data); f.close())
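Steps 5 and 7 could look something like the sketch below, assuming the payload has already been recovered as a list of 0/1 values. `pack_bits` and `write_payload` are hypothetical helpers, not part of any library.

```python
def pack_bits(bits):
    """Pad with zero bits to a multiple of 8, then pack MSB first into bytes."""
    padded = list(bits) + [0] * (-len(bits) % 8)
    out = bytearray()
    for i in range(0, len(padded), 8):
        value = 0
        for bit in padded[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


def write_payload(path, bits):
    """Write the packed payload in binary mode so no byte gets translated."""
    with open(path, "wb") as f:
        f.write(pack_bits(bits))


print(pack_bits([0, 1, 0, 0, 0, 0, 0, 1]))  # b'A'
```

Padding at the end keeps every earlier bit in place, so nothing frame-shifts. The padding bits do end up in the output, so the extractor needs the real length from somewhere if a partial last byte matters.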

Hopefully, if this post wasn't of much help, somewhere like stackoverflow.com would be.


Napoleon's Avatar
Member

Thanks. I also figured out the above solution but I was hoping for THE elegant mathematical solution that doesn't require any extra bits. But it seems that none exists.

I handle everything at the bit level and therefore I'm not worried about some of the problems. I had to write my own class to handle it though, as usually everything is done in bytes. I just always need to know the beginning and the end. To make it even harder to crack I can skip bits or apply other annoying algorithms to it so that the order of the bits is not the actual order of the original file. Try to crack that :). The user, however, must remember which algorithm was used in order to extract the data, as I don't include a header for that for security reasons. Oh, and the bits themselves are optionally encrypted using AES-256 with a custom user-defined password. It took me some time to figure out how to encrypt something at the bit level.
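To give an idea of the reordering, here is a sketch of one possible variant, not the algorithm from my program: a permutation of the bit positions derived from a key. It assumes Python's `random` module as the source of the permutation, which is reproducible for a given seed but not cryptographically secure, and the function names are invented.

```python
import random


def keyed_order(n, key):
    """Derive a repeatable permutation of range(n) from a key."""
    order = list(range(n))
    random.Random(key).shuffle(order)
    return order


def scramble(bits, key):
    """Reorder the bits so position i holds the bit from position order[i]."""
    order = keyed_order(len(bits), key)
    return [bits[i] for i in order]


def unscramble(scrambled, key):
    """Undo scramble() by putting every bit back at its source position."""
    order = keyed_order(len(scrambled), key)
    original = [0] * len(scrambled)
    for dst, src in enumerate(order):
        original[src] = scrambled[dst]
    return original


bits = [1, 0, 1, 1, 0, 0, 1, 0]
print(unscramble(scramble(bits, "secret"), "secret") == bits)  # True
```

Both sides need the same key and the same bit count, which matches the point that the user has to remember how the data was hidden.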

I was also thinking about using 2 carrier images instead of one. Then the hidden file is stored in image A and image A is stored into image B. But oh well… The above is near unbreakable anyway. I just wasn't happy using header and/or footers and such.

And of course I always change only the LSB, as that's just a must. It's the basis of digital stegano with images.

So the extraction process is now like:

  1. Read the carrier image in bytes.
  2. One of the solutions below:
     2a. Read the header to get the file size.
     2b. Use the XOR method to get the changed bits.
  3. Get the LSBs in bit format, in order of appearance.
  4. Optional: apply the algorithms to the bits to get them back in the correct order.
  5. Optional: decrypt it using AES-256 and the appropriate salts, passwords, etc.
  6. Convert the bits to bytes.
  7. Write the bytes to a file.

Crap I talk too much :p