Can you break my encryption
First, hello all, this is my first post, finally, after all those challenges :P
Anyways, I have an algorithm I've created and want to see if anyone can crack it.
This is the first version, and fairly easy (I can make it much harder), also, the next version will have rotating keys and be harder to crack.
So, this is pretty much just a test to see where this first version's at.
Check out http://www.stardotnet.com/software/?q=node/10 for the download link
Direct link to the zip file is here : http://www.stardotnet.com/software/crypto-samples.zip
I've also kept it very easy, it's just text. The zip file has 3 file in it.
-A text file containing the first chapter of Alice in Wonderland -An encrypted file of the Alice in Wonderland file. -An encrypted file of something else.
I've given some clues by supplying a before and after as a sample..
Can you decode the 2nd encrypted file?
Good luck! B)
spyware wrote: Where's the source?
Source for what?
Like I mentioned, there are 3 files in the zip file that I linked.
1-Sample-input.txt < That is an original sample of plain text. (Source) 1-Sample-output-encrypted.txt < That is the encrypted version of the first source file. 2-Challenge-output-encrypted.txt < That is a different encrypted file, the original unencrypted version of the source is not supplied to you.
I've given you both Sample files to give you a before and after of the encryption using my method. Both SAMPLE files are from Chapter 1 of Alice in Wonderland, except that the second SAMPLE file is the encrypted version.
The challenge is in 2 parts.
PART 1) By using the two files that start with "1-Sample", you must figure out how the file has been encrypted**.
PART 2)** Try to decrypt the file "2-Challenge-output-encrypted.txt" back to it's original state using the information gathered from PART 1. Again, the original unencrypted version for this Challenge file is not included, so you must decrypt it yourself.
I've already given a hint. The original unencrypted Challenge file is only text.
I'll give you another hint : It's another chapter from a book (No, not the same book).
If you've unencrypted it, then post either the entire text, or just what chapter and book the text is from.
Again, good luck :)
Modern encryption is secure even though the source code is freely available. Cracking your files would be a pain in the ass, and pass/fail is the only feedback anyone could give you.
The measurement of encryption is roughness/randomness. So Someone would take a known plain text, encrypt it, and then compare the results with various sizes and types of data. Then they can see how your encryption works with hundreds or thousands of different file sizes/types. And you can even graph things like how effective/fast it is.
And besides, it usually is better to use one of the half-dozen or so popular and proven methods anyways. So no one is going to put in that kind of time for you.
maug2 wrote: Modern encryption is secure even though the source code is freely available. Cracking your files would be a pain in the ass, and pass/fail is the only feedback anyone could give you.
The measurement of encryption is roughness/randomness. So Someone would take a known plain text, encrypt it, and then compare the results with various sizes and types of data. Then they can see how your encryption works with hundreds or thousands of different file sizes/types. And you can even graph things like how effective/fast it is.
And besides, it usually is better to use one of the half-dozen or so popular and proven methods anyways. So no one is going to put in that kind of time for you.
Good points maug. I've considered possibly releasing the source code, however, I'm still optimizing it to speed it up at this point. I also need to finish my cycling key methods.
Also, if the source is not released, and nobody can crack it, perhaps it would have more value because the source isn't even available as a starting point to try to figure it out.
Yeah, I'm aware that the software could be decompiled, analyzed etc etc
stealth- wrote: First, just right off the bat: this is a terrible algorithm. Your output binary size is 15 times the size of the input file, and from just an initial look at the hex, quite unnecessary. I am interested in trying to crack this, though, hopefully I'll have time to take a look later today :)
Haha, yes it's terrible. Far from finished. There the repeating quad hex zero pattern that I'll be clearing out to compress it and clean it up. That will shrink it down.
Thanks for checking it out!
xtoro wrote: I've considered possibly releasing the source code, however, I'm still optimizing it to speed it up at this point. spyware wrote: Fast encryptions can be bruteforced fast. I thought he meant he was just trying to get a faster algorithm which produces the same output. Once the source code is released, anyone would be able to optimise the encryption algorithm if they wanted to try to brute force it.
The decrypted text is available at the following link:
http://www.online-literature.com/james_joyce/ulysses/1/
Ulysses by James Joyce, Episode 1.
This is a mono-alphabetic substitution cipher. Each character in the plaintext corresponds to 16-bytes in the cipher text. Essentially each character is exploded into a 16-byte value that is written into the cipher text file. Although the ciphertext is much longer than the plaintext, there still are only 256 possible ciphertext values.
A quick look at the ciphertext shows matching 16-byte values, indicating coincidental characters. Simply by generating a codebook from 16-byte values to characters from the example plain/ciphertext allows us to decrypt 99.5% of the plaintext from the challenge. The remaining unknown codebook values can simply be guessed by looking at the context.
To fully decrypt the ciphertext we need to figure out the algorithm that performs encryption decryption. Looking at consecutive ASCII characters allows us to view a pattern.
a: 9703000083020000E501000039030000 b: 9E03000089020000E80100003B030000
We can see that there is a difference between consequentive WORDs of 0x7000000 for the first WORD, 0x6000000 for the second…
This reminds me of an affine cipher. Let's assume that and see if our hypothesis matches up. An affine transformation is detailed as followed:
C(i) = a*P(i)+b
where C(i) is the ith ciphertext byte, P(i) is the ith plaintext byte, and a and b are positive constants.
After working through some examples, I think that the algorithm might be a modified affine transformation, Perhaps the formula is something more like:
C(i) = a*P(i)+(P(i)%b)*c+d
And the winner is….
bnw33 wrote: The decrypted text is available at the following link:
http://www.online-literature.com/james_joyce/ulysses/1/
Ulysses by James Joyce, Episode 1.
Awesome job dude! See? Not THAT hard, and yes, since there is no rotating key and the algorithm is quite simple, it leads to being the equivalent of a substitution cipher since the bytes always remain equal for the same characters, like mentioned :
bnw33 wrote: This is a mono-alphabetic substitution cipher. Each character in the plaintext corresponds to 16-bytes in the cipher text. Essentially each character is exploded into a 16-byte value that is written into the cipher text file. Although the ciphertext is much longer than the plaintext, there still are only 256 possible ciphertext values.
A quick look at the ciphertext shows matching 16-byte values, indicating coincidental characters. Simply by generating a codebook from 16-byte values to characters from the example plain/ciphertext allows us to decrypt 99.5% of the plaintext from the challenge. The remaining unknown codebook values can simply be guessed by looking at the context.
I'm guessing that you used frequency analysis along with this to determine what bytes correspond to what characters right? The letter E being the most common…
bnw33 wrote: To fully decrypt the ciphertext we need to figure out the algorithm that performs encryption decryption. Looking at consecutive ASCII characters allows us to view a pattern.
a: 9703000083020000E501000039030000 b: 9E03000089020000E80100003B030000
We can see that there is a difference between consequentive WORDs of 0x7000000 for the first WORD, 0x6000000 for the second…
This reminds me of an affine cipher. Let's assume that and see if our hypothesis matches up. An affine transformation is detailed as followed:
C(i) = a*P(i)+b
where C(i) is the ith ciphertext byte, P(i) is the ith plaintext byte, and a and b are positive constants.
After working through some examples, I think that the algorithm might be a modified affine transformation, Perhaps the formula is something more like:
C(i) = a*P(i)+(P(i)%b)*c+d
Very close, but no, although at this point with this version, it does very well seem like a modified affine tranform, but the formula is wrong.
My question is, did you actually decode the entire thing or enough to figure out what it was?
Good job by the way, that was pretty fast!
I'll be posting up my next algo soon, it will be more difficult ;)
Stay tuned!
xtoro wrote:
Very close, but no, although at this point with this version, it does very well seem like a modified affine tranform, but the formula is wrong.
My question is, did you actually decode the entire thing or enough to figure out what it was?
Good job by the way, that was pretty fast!
I'll be posting up my next algo soon, it will be more difficult ;)
Stay tuned!
Initially, after looking at the plaintext and seeing matching lines, I though that I would try to make a script that mapped 16-byte->characters from cipher->plaintext. I then applied this "dictionary" to the encrypted file and was able to recover a large portion of the plaintext.
After this I wanted to discover the encryption/decryption formula that converted between plaintext/ciphertext. I didn't fully discover it, although I have a better idea of what it is now. To do this I used my initial hunch of C(i) = a*P(i)+b, since successive characters seemed to be increasing by a constant.
I then isolated b and printed it out along the ascii-continuum, and again I saw that there were gaps between every 10 characters. Also I noticed that characters >= 100, the b value was increasing by a certain value every character.
So I think the formula is more like this. Using integer maths, the last term will only be non-zero when P(i) >= 100.
C(i) = P(i)*a + (P(i)/10)*b + (P(i)/100)*P(i)*c
This was a fun exercise and I look forward to your next challenge!
Apologies on the delay for my next version, but I was away on business in Italy when I posted my last challenge and had a family emergency, so I had to go home to Canada for a few weeks. Sorry to report I've had no progress since my last post, but I've taken many notes and come up with several ideas on my long flights lately. Glad to be back in Germany :)
My future versions have some significant alterations that come with my new ideas, and will add much perceived randomness to the outputs. My first challenge (this post) will still be the base of my future versions, minus all the extra zero padding…