Search Results

Search found 28 results on 2 pages for 'solomon'.

Page 2/2 | < Previous Page | 1 2 

  • YouTube Scalability Lessons

    - by Bertrand Matthelié
    @font-face { font-family: "Arial"; }@font-face { font-family: "Courier New"; }@font-face { font-family: "Wingdings"; }@font-face { font-family: "Calibri"; }@font-face { font-family: "Cambria"; }p.MsoNormal, li.MsoNormal, div.MsoNormal { margin: 0cm 0cm 0.0001pt; font-size: 12pt; font-family: "Times New Roman"; }h2 { margin: 12pt 0cm 3pt; page-break-after: avoid; font-size: 14pt; font-family: "Times New Roman"; font-style: italic; }a:link, span.MsoHyperlink { color: blue; text-decoration: underline; }a:visited, span.MsoHyperlinkFollowed { color: purple; text-decoration: underline; }span.Heading2Char { font-family: Calibri; font-weight: bold; font-style: italic; }div.Section1 { page: Section1; }ol { margin-bottom: 0cm; }ul { margin-bottom: 0cm; } Very interesting blog post by Todd Hoff at highscalability.com presenting “7 Years of YouTube Scalability Lessons in 30 min” based on a presentation from Mike Solomon, one of the original engineers at YouTube: …. The key takeaway away of the talk for me was doing a lot with really simple tools. While many teams are moving on to more complex ecosystems, YouTube really does keep it simple. They program primarily in Python, use MySQL as their database, they’ve stuck with Apache, and even new features for such a massive site start as a very simple Python program. That doesn’t mean YouTube doesn’t do cool stuff, they do, but what makes everything work together is more a philosophy or a way of doing things than technological hocus pocus. What made YouTube into one of the world’s largest websites? Read on and see... Stats @font-face { font-family: "Arial"; }@font-face { font-family: "Cambria"; }p.MsoNormal, li.MsoNormal, div.MsoNormal { margin: 0cm 0cm 0.0001pt; font-size: 12pt; font-family: "Times New Roman"; }div.Section1 { page: Section1; } 4 billion Views a day 60 hours of video is uploaded every minute 350+ million devices are YouTube enabled Revenue double in 2010 The number of videos has gone up 9 orders of magnitude and the number of developers has only gone up two orders of magnitude. 1 million lines of Python code Stack @font-face { font-family: "Arial"; }@font-face { font-family: "Cambria"; }p.MsoNormal, li.MsoNormal, div.MsoNormal { margin: 0cm 0cm 0.0001pt; font-size: 12pt; font-family: "Times New Roman"; }div.Section1 { page: Section1; } Python - most of the lines of code for YouTube are still in Python. Everytime you watch a YouTube video you are executing a bunch of Python code. Apache - when you think you need to get rid of it, you don’t. Apache is a real rockstar technology at YouTube because they keep it simple. Every request goes through Apache. Linux - the benefit of Linux is there’s always a way to get in and see how your system is behaving. No matter how bad your app is behaving, you can take a look at it with Linux tools like strace and tcpdump. MySQL - is used a lot. When you watch a video you are getting data from MySQL. Sometime it’s used a relational database or a blob store. It’s about tuning and making choices about how you organize your data. Vitess- a  new project released by YouTube, written in Go, it’s a frontend to MySQL. It does a lot of optimization on the fly, it rewrites queries and acts as a proxy. Currently it serves every YouTube database request. It’s RPC based. Zookeeper - a distributed lock server. It’s used for configuration. Really interesting piece of technology. Hard to use correctly so read the manual Wiseguy - a CGI servlet container. Spitfire - a templating system. It has an abstract syntax tree that let’s them do transformations to make things go faster. Serialization formats - no matter which one you use, they are all expensive. Measure. Don’t use pickle. Not a good choice. Found protocol buffers slow. They wrote their own BSON implementation, which is 10-15 time faster than the one you can download. ...Contiues. Read the blog Watch the video

    Read the article

  • Speeding up a search .net 4.0

    - by user231465
    Wondering if I can speed up the search. I need to build a functionality that has to be used by many UI screens The one I have got works but I need to make sure I am implementing a fast algoritim if you like It's like an incremental search. User types a word to search for eg const string searchFor = "Guinea"; const char nextLetter = ' ' It looks in the list and returns 2 records "Guinea and Guinea Bissau " User types a word to search for eg const string searchFor = "Gu"; const char nextLetter = 'i' returns 3 results. This is the function but I would like to speed it up. Is there a pattern for this kind of search? class Program { static void Main() { //Find all countries that begin with string + a possible letter added to it //const string searchFor = "Guinea"; //const char nextLetter = ' '; //returns 2 results const string searchFor = "Gu"; const char nextLetter = 'i'; List<string> result = FindPossibleMatches(searchFor, nextLetter); result.ForEach(x=>Console.WriteLine(x)); //returns 3 results Console.Read(); } /// <summary> /// Find all possible matches /// </summary> /// <param name="searchFor">string to search for</param> /// <param name="nextLetter">pretend user as just typed a letter</param> /// <returns></returns> public static List<string> FindPossibleMatches (string searchFor, char nextLetter) { var hashedCountryList = new HashSet<string>(CountriesList()); var result=new List<string>(); IEnumerable<string> tempCountryList = hashedCountryList.Where(x => x.StartsWith(searchFor)); foreach (string item in tempCountryList) { string tempSearchItem; if (nextLetter == ' ') { tempSearchItem = searchFor; } else { tempSearchItem = searchFor + nextLetter; } if(item.StartsWith(tempSearchItem)) { result.Add(item); } } return result; } /// <summary> /// Returns list of countries. /// </summary> public static string[] CountriesList() { return new[] { "Afghanistan", "Albania", "Algeria", "American Samoa", "Andorra", "Angola", "Anguilla", "Antarctica", "Antigua And Barbuda", "Argentina", "Armenia", "Aruba", "Australia", "Austria", "Azerbaijan", "Bahamas", "Bahrain", "Bangladesh", "Barbados", "Belarus", "Belgium", "Belize", "Benin", "Bermuda", "Bhutan", "Bolivia", "Bosnia Hercegovina", "Botswana", "Bouvet Island", "Brazil", "Brunei Darussalam", "Bulgaria", "Burkina Faso", "Burundi", "Byelorussian SSR", "Cambodia", "Cameroon", "Canada", "Cape Verde", "Cayman Islands", "Central African Republic", "Chad", "Chile", "China", "Christmas Island", "Cocos (Keeling) Islands", "Colombia", "Comoros", "Congo", "Cook Islands", "Costa Rica", "Cote D'Ivoire", "Croatia", "Cuba", "Cyprus", "Czech Republic", "Czechoslovakia", "Denmark", "Djibouti", "Dominica", "Dominican Republic", "East Timor", "Ecuador", "Egypt", "El Salvador", "England", "Equatorial Guinea", "Eritrea", "Estonia", "Ethiopia", "Falkland Islands", "Faroe Islands", "Fiji", "Finland", "France", "Gabon", "Gambia", "Georgia", "Germany", "Ghana", "Gibraltar", "Great Britain", "Greece", "Greenland", "Grenada", "Guadeloupe", "Guam", "Guatemela", "Guernsey", "Guiana", "Guinea", "Guinea Bissau", "Guyana", "Haiti", "Heard Islands", "Honduras", "Hong Kong", "Hungary", "Iceland", "India", "Indonesia", "Iran", "Iraq", "Ireland", "Isle Of Man", "Israel", "Italy", "Jamaica", "Japan", "Jersey", "Jordan", "Kazakhstan", "Kenya", "Kiribati", "Korea, South", "Korea, North", "Kuwait", "Kyrgyzstan", "Lao People's Dem. Rep.", "Latvia", "Lebanon", "Lesotho", "Liberia", "Libya", "Liechtenstein", "Lithuania", "Luxembourg", "Macau", "Macedonia", "Madagascar", "Malawi", "Malaysia", "Maldives", "Mali", "Malta", "Mariana Islands", "Marshall Islands", "Martinique", "Mauritania", "Mauritius", "Mayotte", "Mexico", "Micronesia", "Moldova", "Monaco", "Mongolia", "Montserrat", "Morocco", "Mozambique", "Myanmar", "Namibia", "Nauru", "Nepal", "Netherlands", "Netherlands Antilles", "Neutral Zone", "New Caledonia", "New Zealand", "Nicaragua", "Niger", "Nigeria", "Niue", "Norfolk Island", "Northern Ireland", "Norway", "Oman", "Pakistan", "Palau", "Panama", "Papua New Guinea", "Paraguay", "Peru", "Philippines", "Pitcairn", "Poland", "Polynesia", "Portugal", "Puerto Rico", "Qatar", "Reunion", "Romania", "Russian Federation", "Rwanda", "Saint Helena", "Saint Kitts", "Saint Lucia", "Saint Pierre", "Saint Vincent", "Samoa", "San Marino", "Sao Tome and Principe", "Saudi Arabia", "Scotland", "Senegal", "Seychelles", "Sierra Leone", "Singapore", "Slovakia", "Slovenia", "Solomon Islands", "Somalia", "South Africa", "South Georgia", "Spain", "Sri Lanka", "Sudan", "Suriname", "Svalbard", "Swaziland", "Sweden", "Switzerland", "Syrian Arab Republic", "Taiwan", "Tajikista", "Tanzania", "Thailand", "Togo", "Tokelau", "Tonga", "Trinidad and Tobago", "Tunisia", "Turkey", "Turkmenistan", "Turks and Caicos Islands", "Tuvalu", "Uganda", "Ukraine", "United Arab Emirates", "United Kingdom", "United States", "Uruguay", "Uzbekistan", "Vanuatu", "Vatican City State", "Venezuela", "Vietnam", "Virgin Islands", "Wales", "Western Sahara", "Yemen", "Yugoslavia", "Zaire", "Zambia", "Zimbabwe" }; } } } Any suggestions? Thanks

    Read the article

  • Steganography Experiment - Trouble hiding message bits in DCT coefficients

    - by JohnHankinson
    I have an application requiring me to be able to embed loss-less data into an image. As such I've been experimenting with steganography, specifically via modification of DCT coefficients as the method I select, apart from being loss-less must also be relatively resilient against format conversion, scaling/DSP etc. From the research I've done thus far this method seems to be the best candidate. I've seen a number of papers on the subject which all seem to neglect specific details (some neglect to mention modification of 0 coefficients, or modification of AC coefficient etc). After combining the findings and making a few modifications of my own which include: 1) Using a more quantized version of the DCT matrix to ensure we only modify coefficients that would still be present should the image be JPEG'ed further or processed (I'm using this in place of simply following a zig-zag pattern). 2) I'm modifying bit 4 instead of the LSB and then based on what the original bit value was adjusting the lower bits to minimize the difference. 3) I'm only modifying the blue channel as it should be the least visible. This process must modify the actual image and not the DCT values stored in file (like jsteg) as there is no guarantee the file will be a JPEG, it may also be opened and re-saved at a later stage in a different format. For added robustness I've included the message multiple times and use the bits that occur most often, I had considered using a QR code as the message data or simply applying the reed-solomon error correction, but for this simple application and given that the "message" in question is usually going to be between 10-32 bytes I have plenty of room to repeat it which should provide sufficient redundancy to recover the true bits. No matter what I do I don't seem to be able to recover the bits at the decode stage. I've tried including / excluding various checks (even if it degrades image quality for the time being). I've tried using fixed point vs. double arithmetic, moving the bit to encode, I suspect that the message bits are being lost during the IDCT back to image. Any thoughts or suggestions on how to get this working would be hugely appreciated. (PS I am aware that the actual DCT/IDCT could be optimized from it's naive On4 operation using row column algorithm, or an FDCT like AAN, but for now it just needs to work :) ) Reference Papers: http://www.lokminglui.com/dct.pdf http://arxiv.org/ftp/arxiv/papers/1006/1006.1186.pdf Code for the Encode/Decode process in C# below: using System; using System.Collections.Generic; using System.Linq; using System.Text; using System.Drawing.Imaging; using System.Drawing; namespace ImageKey { public class Encoder { public const int HIDE_BIT_POS = 3; // use bit position 4 (1 << 3). public const int HIDE_COUNT = 16; // Number of times to repeat the message to avoid error. // JPEG Standard Quantization Matrix. // (to get higher quality multiply by (100-quality)/50 .. // for lower than 50 multiply by 50/quality. Then round to integers and clip to ensure only positive integers. public static double[] Q = {16,11,10,16,24,40,51,61, 12,12,14,19,26,58,60,55, 14,13,16,24,40,57,69,56, 14,17,22,29,51,87,80,62, 18,22,37,56,68,109,103,77, 24,35,55,64,81,104,113,92, 49,64,78,87,103,121,120,101, 72,92,95,98,112,100,103,99}; // Maximum qauality quantization matrix (if all 1's doesn't modify coefficients at all). public static double[] Q2 = {1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1}; public static Bitmap Encode(Bitmap b, string key) { Bitmap response = new Bitmap(b.Width, b.Height, PixelFormat.Format32bppArgb); uint imgWidth = ((uint)b.Width) & ~((uint)7); // Maximum usable X resolution (divisible by 8). uint imgHeight = ((uint)b.Height) & ~((uint)7); // Maximum usable Y resolution (divisible by 8). // Start be transferring the unmodified image portions. // As we'll be using slightly less width/height for the encoding process we'll need the edges to be populated. for (int y = 0; y < b.Height; y++) for (int x = 0; x < b.Width; x++) { if( (x >= imgWidth && x < b.Width) || (y>=imgHeight && y < b.Height)) response.SetPixel(x, y, b.GetPixel(x, y)); } // Setup the counters and byte data for the message to encode. StringBuilder sb = new StringBuilder(); for(int i=0;i<HIDE_COUNT;i++) sb.Append(key); byte[] codeBytes = System.Text.Encoding.ASCII.GetBytes(sb.ToString()); int bitofs = 0; // Current bit position we've encoded too. int totalBits = (codeBytes.Length * 8); // Total number of bits to encode. for (int y = 0; y < imgHeight; y += 8) { for (int x = 0; x < imgWidth; x += 8) { int[] redData = GetRedChannelData(b, x, y); int[] greenData = GetGreenChannelData(b, x, y); int[] blueData = GetBlueChannelData(b, x, y); int[] newRedData; int[] newGreenData; int[] newBlueData; if (bitofs < totalBits) { double[] redDCT = DCT(ref redData); double[] greenDCT = DCT(ref greenData); double[] blueDCT = DCT(ref blueData); int[] redDCTI = Quantize(ref redDCT, ref Q2); int[] greenDCTI = Quantize(ref greenDCT, ref Q2); int[] blueDCTI = Quantize(ref blueDCT, ref Q2); int[] blueDCTC = Quantize(ref blueDCT, ref Q); HideBits(ref blueDCTI, ref blueDCTC, ref bitofs, ref totalBits, ref codeBytes); double[] redDCT2 = DeQuantize(ref redDCTI, ref Q2); double[] greenDCT2 = DeQuantize(ref greenDCTI, ref Q2); double[] blueDCT2 = DeQuantize(ref blueDCTI, ref Q2); newRedData = IDCT(ref redDCT2); newGreenData = IDCT(ref greenDCT2); newBlueData = IDCT(ref blueDCT2); } else { newRedData = redData; newGreenData = greenData; newBlueData = blueData; } MapToRGBRange(ref newRedData); MapToRGBRange(ref newGreenData); MapToRGBRange(ref newBlueData); for(int dy=0;dy<8;dy++) { for(int dx=0;dx<8;dx++) { int col = (0xff<<24) + (newRedData[dx+(dy*8)]<<16) + (newGreenData[dx+(dy*8)]<<8) + (newBlueData[dx+(dy*8)]); response.SetPixel(x+dx,y+dy,Color.FromArgb(col)); } } } } if (bitofs < totalBits) throw new Exception("Failed to encode data - insufficient cover image coefficients"); return (response); } public static void HideBits(ref int[] DCTMatrix, ref int[] CMatrix, ref int bitofs, ref int totalBits, ref byte[] codeBytes) { int tempValue = 0; for (int u = 0; u < 8; u++) { for (int v = 0; v < 8; v++) { if ( (u != 0 || v != 0) && CMatrix[v+(u*8)] != 0 && DCTMatrix[v+(u*8)] != 0) { if (bitofs < totalBits) { tempValue = DCTMatrix[v + (u * 8)]; int bytePos = (bitofs) >> 3; int bitPos = (bitofs) % 8; byte mask = (byte)(1 << bitPos); byte value = (byte)((codeBytes[bytePos] & mask) >> bitPos); // 0 or 1. if (value == 0) { int a = DCTMatrix[v + (u * 8)] & (1 << HIDE_BIT_POS); if (a != 0) DCTMatrix[v + (u * 8)] |= (1 << HIDE_BIT_POS) - 1; DCTMatrix[v + (u * 8)] &= ~(1 << HIDE_BIT_POS); } else if (value == 1) { int a = DCTMatrix[v + (u * 8)] & (1 << HIDE_BIT_POS); if (a == 0) DCTMatrix[v + (u * 8)] &= ~((1 << HIDE_BIT_POS) - 1); DCTMatrix[v + (u * 8)] |= (1 << HIDE_BIT_POS); } if (DCTMatrix[v + (u * 8)] != 0) bitofs++; else DCTMatrix[v + (u * 8)] = tempValue; } } } } } public static void MapToRGBRange(ref int[] data) { for(int i=0;i<data.Length;i++) { data[i] += 128; if(data[i] < 0) data[i] = 0; else if(data[i] > 255) data[i] = 255; } } public static int[] GetRedChannelData(Bitmap b, int sx, int sy) { int[] data = new int[8 * 8]; for (int y = sy; y < (sy + 8); y++) { for (int x = sx; x < (sx + 8); x++) { uint col = (uint)b.GetPixel(x,y).ToArgb(); data[(x - sx) + ((y - sy) * 8)] = (int)((col >> 16) & 0xff) - 128; } } return (data); } public static int[] GetGreenChannelData(Bitmap b, int sx, int sy) { int[] data = new int[8 * 8]; for (int y = sy; y < (sy + 8); y++) { for (int x = sx; x < (sx + 8); x++) { uint col = (uint)b.GetPixel(x, y).ToArgb(); data[(x - sx) + ((y - sy) * 8)] = (int)((col >> 8) & 0xff) - 128; } } return (data); } public static int[] GetBlueChannelData(Bitmap b, int sx, int sy) { int[] data = new int[8 * 8]; for (int y = sy; y < (sy + 8); y++) { for (int x = sx; x < (sx + 8); x++) { uint col = (uint)b.GetPixel(x, y).ToArgb(); data[(x - sx) + ((y - sy) * 8)] = (int)((col >> 0) & 0xff) - 128; } } return (data); } public static int[] Quantize(ref double[] DCTMatrix, ref double[] Q) { int[] DCTMatrixOut = new int[8*8]; for (int u = 0; u < 8; u++) { for (int v = 0; v < 8; v++) { DCTMatrixOut[v + (u * 8)] = (int)Math.Round(DCTMatrix[v + (u * 8)] / Q[v + (u * 8)]); } } return(DCTMatrixOut); } public static double[] DeQuantize(ref int[] DCTMatrix, ref double[] Q) { double[] DCTMatrixOut = new double[8*8]; for (int u = 0; u < 8; u++) { for (int v = 0; v < 8; v++) { DCTMatrixOut[v + (u * 8)] = (double)DCTMatrix[v + (u * 8)] * Q[v + (u * 8)]; } } return(DCTMatrixOut); } public static double[] DCT(ref int[] data) { double[] DCTMatrix = new double[8 * 8]; for (int v = 0; v < 8; v++) { for (int u = 0; u < 8; u++) { double cu = 1; if (u == 0) cu = (1.0 / Math.Sqrt(2.0)); double cv = 1; if (v == 0) cv = (1.0 / Math.Sqrt(2.0)); double sum = 0.0; for (int y = 0; y < 8; y++) { for (int x = 0; x < 8; x++) { double s = data[x + (y * 8)]; double dctVal = Math.Cos((2 * y + 1) * v * Math.PI / 16) * Math.Cos((2 * x + 1) * u * Math.PI / 16); sum += s * dctVal; } } DCTMatrix[u + (v * 8)] = (0.25 * cu * cv * sum); } } return (DCTMatrix); } public static int[] IDCT(ref double[] DCTMatrix) { int[] Matrix = new int[8 * 8]; for (int y = 0; y < 8; y++) { for (int x = 0; x < 8; x++) { double sum = 0; for (int v = 0; v < 8; v++) { for (int u = 0; u < 8; u++) { double cu = 1; if (u == 0) cu = (1.0 / Math.Sqrt(2.0)); double cv = 1; if (v == 0) cv = (1.0 / Math.Sqrt(2.0)); double idctVal = (cu * cv) / 4.0 * Math.Cos((2 * y + 1) * v * Math.PI / 16) * Math.Cos((2 * x + 1) * u * Math.PI / 16); sum += (DCTMatrix[u + (v * 8)] * idctVal); } } Matrix[x + (y * 8)] = (int)Math.Round(sum); } } return (Matrix); } } public class Decoder { public static string Decode(Bitmap b, int expectedLength) { expectedLength *= Encoder.HIDE_COUNT; uint imgWidth = ((uint)b.Width) & ~((uint)7); // Maximum usable X resolution (divisible by 8). uint imgHeight = ((uint)b.Height) & ~((uint)7); // Maximum usable Y resolution (divisible by 8). // Setup the counters and byte data for the message to decode. byte[] codeBytes = new byte[expectedLength]; byte[] outBytes = new byte[expectedLength / Encoder.HIDE_COUNT]; int bitofs = 0; // Current bit position we've decoded too. int totalBits = (codeBytes.Length * 8); // Total number of bits to decode. for (int y = 0; y < imgHeight; y += 8) { for (int x = 0; x < imgWidth; x += 8) { int[] blueData = ImageKey.Encoder.GetBlueChannelData(b, x, y); double[] blueDCT = ImageKey.Encoder.DCT(ref blueData); int[] blueDCTI = ImageKey.Encoder.Quantize(ref blueDCT, ref Encoder.Q2); int[] blueDCTC = ImageKey.Encoder.Quantize(ref blueDCT, ref Encoder.Q); if (bitofs < totalBits) GetBits(ref blueDCTI, ref blueDCTC, ref bitofs, ref totalBits, ref codeBytes); } } bitofs = 0; for (int i = 0; i < (expectedLength / Encoder.HIDE_COUNT) * 8; i++) { int bytePos = (bitofs) >> 3; int bitPos = (bitofs) % 8; byte mask = (byte)(1 << bitPos); List<int> values = new List<int>(); int zeroCount = 0; int oneCount = 0; for (int j = 0; j < Encoder.HIDE_COUNT; j++) { int val = (codeBytes[bytePos + ((expectedLength / Encoder.HIDE_COUNT) * j)] & mask) >> bitPos; values.Add(val); if (val == 0) zeroCount++; else oneCount++; } if (oneCount >= zeroCount) outBytes[bytePos] |= mask; bitofs++; values.Clear(); } return (System.Text.Encoding.ASCII.GetString(outBytes)); } public static void GetBits(ref int[] DCTMatrix, ref int[] CMatrix, ref int bitofs, ref int totalBits, ref byte[] codeBytes) { for (int u = 0; u < 8; u++) { for (int v = 0; v < 8; v++) { if ((u != 0 || v != 0) && CMatrix[v + (u * 8)] != 0 && DCTMatrix[v + (u * 8)] != 0) { if (bitofs < totalBits) { int bytePos = (bitofs) >> 3; int bitPos = (bitofs) % 8; byte mask = (byte)(1 << bitPos); int value = DCTMatrix[v + (u * 8)] & (1 << Encoder.HIDE_BIT_POS); if (value != 0) codeBytes[bytePos] |= mask; bitofs++; } } } } } } } UPDATE: By switching to using a QR Code as the source message and swapping a pair of coefficients in each block instead of bit manipulation I've been able to get the message to survive the transform. However to get the message to come through without corruption I have to adjust both coefficients as well as swap them. For example swapping (3,4) and (4,3) in the DCT matrix and then respectively adding 8 and subtracting 8 as an arbitrary constant seems to work. This survives a re-JPEG'ing of 96 but any form of scaling/cropping destroys the message again. I was hoping that by operating on mid to low frequency values that the message would be preserved even under some light image manipulation.

    Read the article

< Previous Page | 1 2