Hi,
you can use JPEG itself! Here is my code, but I make no efforts to optimize it thus it can be really slow. First of all, you need define RGB to 
YCbCr transform. I found very nice code 
here via Mr.Wizard (up to some small correction):
 With[{v1 = Developer`ToPackedArray@{16`, 128`, 128`}, 
   m1 = Developer`ToPackedArray@{{65.481, 128.553, 
       24.966}, {-37.797, -74.203, 112`}, {112`, -93.786, -18.214}}},
  RGBtoYCbCr[a : {_, _, _}?VectorQ] := v1 + m1.a;
  RGBtoYCbCr[img_Image] := RGBtoYCbCr@ImageData[img(*,"Byte"*)];
  RGBtoYCbCr[a_ /; MatchQ[Dimensions@a, {_, 3}]] := 
   Transpose[v1 + m1.Transpose[a]];
  RGBtoYCbCr[a_ /; MatchQ[Dimensions@a, {_, _, 3}]] := 
   Transpose[v1 + m1.Transpose[a, {2, 3, 1}], {3, 1, 2}];]
Inverse transform:
 With[{v1 = Developer`ToPackedArray@{-16`, -128`, -128`}, 
   m1 = Developer`ToPackedArray@
     Inverse@{{65.481, 128.553, 24.966}, {-37.797, -74.203, 
        112`}, {112`, -93.786, -18.214}}},
  YCbCrtoRGB[a : {_, _, _}?VectorQ] := m1.(a + v1);
  YCbCrtoRGB[a_ /; MatchQ[Dimensions@a, {_, 3}]] := 
   Transpose[m1.(Transpose[a] + v1)];
  YCbCrtoRGB[a_ /; MatchQ[Dimensions@a, {_, _, 3}]] := 
   Transpose[m1.(Transpose[a, {2, 3, 1}] + v1), {3, 1, 2}];]
Then it is necessary to define the Fourier discrete cosine transform.
fnorm = Developer`ToPackedArray@
   Normal@SparseArray[{{1, 1} -> 1, {i_, i_} -> Sqrt[2]}, {8, 8}];
bnorm = Developer`ToPackedArray@
   Normal@SparseArray[{{1, 1} -> 1, {i_, i_} -> 1/Sqrt[2]}, {8, 8}];
ForwardDCT[matrix_] := fnorm.FourierDCT[matrix - 128, 2].fnorm;
BackwardDCT[matrix_] := 128 + FourierDCT[bnorm.matrix.bnorm, 3];
Please, notice the JPEG uses DCT in slightly different way than Wolfram Mathematica. Next step is to define the quntization tabels for all channels:
 DQT["luma"] = 
   Developer`ToPackedArray@{{16, 11, 10, 16, 24, 40, 51, 61}, {12, 12, 
      14, 19, 26, 58, 60, 55}, {14, 13, 16, 24, 40, 57, 69, 56}, {14, 
      17, 22, 29, 51, 87, 80, 62}, {18, 22, 37, 56, 68, 109, 103, 
      77}, {24, 35, 55, 64, 81, 104, 113, 92}, {49, 64, 78, 87, 103, 
      121, 120, 101}, {72, 92, 95, 98, 112, 100, 103, 99}};
 
 DQT["chroma"] = 
   Developer`ToPackedArray@{{17, 18, 24, 47, 99, 99, 99, 99}, {18, 21, 
     26, 66, 99, 99, 99, 99}, {24, 26, 56, 99, 99, 99, 99, 99}, {47, 
     66, 99, 99, 99, 99, 99, 99}, {99, 99, 99, 99, 99, 99, 99, 
     99}, {99, 99, 99, 99, 99, 99, 99, 99}, {99, 99, 99, 99, 99, 99, 
     99, 99}, {99, 99, 99, 99, 99, 99, 99, 99}};
The last step is the quntization itself:
Quantization[channel_, block_, quality_] := 
  BackwardDCT[
   quality DQT[channel] Round[
     ForwardDCT[block]/(quality DQT[channel])]];
Now, we can start. Import an image and divide it into 8*8 blocks:
horse = Import["http://i.stack.imgur.com/ZhYwB.jpg"]
blocks = Partition[#, {8, 8}] & /@ 
   Transpose[RGBtoYCbCr@horse, {2, 3, 1}];

Then we need to choose the quality of the output.
quality = 2;
Notice that the quality should be large than 1 and quality=1 reduces the size of the output file by factor 1.8. Thus we put a really terrible quality. The next step is quantization of each channel:
 luminance[1] = 
   Join @@ (Join @@@ Transpose[#, {2, 1, 3}] & /@ 
      Map[Quantization["luma", #, quality] &, blocks[[1]], {2}]);
 chrominance[1] = 
   Join @@ (Join @@@ Transpose[#, {2, 1, 3}] & /@ 
      Map[Quantization["chroma", #, quality] &, blocks[[2]], {2}]);
 chrominance[2] = 
   Join @@ (Join @@@ Transpose[#, {2, 1, 3}] & /@ 
      Map[Quantization["chroma", #, quality] &, blocks[[3]], {2}]);
This part works really slow, hence please try to optimize it. The last step is to convert to RGB and save to disk:
SetDirectory[NotebookDirectory[]];
Export["horse.jpg", 
 Image[YCbCrtoRGB@
   Transpose[{luminance[1], chrominance[1], chrominance[2]}, {3, 1, 
     2}]]]
The size of the output file is 2.2 times smaler then initial one. The Mathematica file is attached to this post. 
 
				
					
				
				
					
					
						
							 Attachments:
							Attachments: