Search Results

Search found 28 results on 2 pages for 'y nk'.

Page 1/2 | 1 2 | Next Page >

Patch an Existing NK.BIN

- by Kate Moss' Open Space

As you know, we can use MAKEIMG.EXE tool to create OS Image file, NK.BIN, or ROMIMAGE.EXE with a BIB for more accurate. But what if the image file is already created but need to be patched or you want to extract a file from NK.BIN? The Platform Builder provide many useful command line utilities, and today I am going to introduce one, BINMOD.EXE. http://msdn.microsoft.com/en-us/library/ee504622.aspx is the official page for BINMOD tool. As the page says, The BinMod Tool (binmod.exe) extracts files from a run-time image, and replaces files in a run-time image and its usage binmod [-i imagename] [-r replacement_filename.ext | -e extraction_filename.ext] This is a simple tool and is easy to use, if we want to extract a file from nk.bin, just type binmod –i nk.bin –e filename.ext And that's it! Or use can try -r command to replace a file inside NK.BIN. The small tool is good but there is a limitation; due to the files in MODULES section are fixed up during ROMIMAGE so the original file format is not preserved, therefore extract or replace file in MODULE section will be impossible. So just like this small tool, this post supposed to be end here, right? Nah... It is not that easy. Just try the above example, and you will find, the tool is not work! Double check the file is in FILES section and the NK.BIN is good, but it just quits. Before you throw away this useless toy, we can try to fix it! Yes, the source of this tool is available in your CE6, private\winceos\COREOS\nk\tools\romimage\binmod. As it is a tool run in your Windows so you need to Windows SDK or Visual Studio to build the code. (I am going to save you some time by skipping the detail as building a desktop console mode program is fairly trivial) The cbinmod.cpp is the core logic for this program and follow up the error message we got, it looks like the following code is suspected. // // Extra sanity check... // if((DWORD)(HIWORD(pTOCLoc->dllfirst) << 16) <= pTOCLoc->dlllast && (DWORD)(LOWORD(pTOCLoc->dllfirst) << 16) <= pTOCLoc->dlllast) { dprintf("Found pTOC = 0x%08x\n", (DWORD)dwpTOC); fFoundIt = true; break; } else { dprintf("NOTICE! Record %d looked like a TOC except DLL first = 0x%08X, and DLL last = 0x%08X\r\n", i, pTOCLoc->dllfirst, pTOCLoc->dlllast); } The logic checks if dllfirst <= dlllast but look closer, the code only separated the high/low WORD from dllfirst but does not apply the same to dlllast, is that on purpose or a bug? While the TOC is created by ROMIMAGE.EXE, so let's move to ROMIMAGE. In private\winceos\coreos\nk\tools\romimage\romimage\bin.cpp Module::s_romhdr.dllfirst = (HIWORD(xip_mem->dll_data_bottom) << 16) | HIWORD(xip_mem->kernel_dll_bottom); Module::s_romhdr.dlllast = (HIWORD(xip_mem->dll_data_top) << 16) | HIWORD(xip_mem->kernel_dll_top); It is clear now, the high word of dll first is the upper 16 bits of XIP DLL bottom and the low word is the upper 16 bits of kernel dll bottom; also, the high word of dll last is the upper 16 bits of XIP DLL top and the low word is the upper 16 bits of kernel dll top. Obviously, the correct statement should be if((DWORD)(HIWORD(pTOCLoc->dllfirst) << 16) <= (DWORD)(HIWORD(pTOCLoc->dlllast) << 16) && (DWORD)(LOWORD(pTOCLoc->dllfirst) << 16) <= (DWORD)(LOWORD(pTOCLoc->dlllast) << 16)) So update the code like this should fix this issue or just like the comment, it is an extra sanity check, you can just get rid of it, either way can make the code moving forward and everything worked as advertised. "Extracting out copies of files from the nk.bin... replacing files... etc." Since the NK.BIN can be compressed, so the BinMod needs the compress.dll to decompress the data, the DLL can be found in C:\program files\microsoft platform builder\6.00\cepb\idevs\imgutils.

Read the article
Build Your Own CE6 Kernel

- by Kate Moss' Big Fan

The Share Source Program in Windows CE provides many modules in %_WINCEROOT%\Private\ tree, and the kernel is one of them! Although it is not full source of kernel but it is good enough for tracing it, even tweak the kernel. Tracing the kernel and see how it works is lots of fun, but it is fascinated to modify and verify the change you made. So first comes first, where is the source of kernel? It's in your %_WINCEROOT%\private\winceos\COREOS\nk\ And next question will be "How do I build it?", Some of you may say just "build -c" there and it should be good. If you are the owner of kernel and got full source, that is definitely the right answer, but none of them are applied to our case though. So what should I do? Let's dig deeper into the coreos\nk folder, there are a couples of subfolder, CELOG, KDSTUB, KERNEL and etc. KERNEL\ is the main component of kernel.dll, in the other word, most of the modify to kernel is going to happen here. And the good thing is, you could "build -c" in %_WINCEROOT%\private\winceos\COREOS\nk\kernel\ with no error at all. But before doing that, remember to backup eveything you are going to modify, including the source and binaries; remember, this is not something belong to you, and if you didn't restore them back later, it could end up confuse the subsequence QFE updates! Here is the steps Backup the source code, I will suggest the whole %_WINCEROOT%\private\winceos\COREOS\nk\ Backup the binaries in common\oak\lib\, and again if you are not sure which files, backup the whole %_WINCEROOT%\common\oak\lib\ is the safest way. Do whatever modification you want in %_WINCEROOT%\private\winceos\COREOS\nk\kernel\ build -c in %_WINCEROOT%\private\winceos\COREOS\nk\kernel If everything went well so far, you should get a new nkmain.lib,nkmain.pdb, nkprmain.lib and nkprmain.pdb in %_WINCEROOT%\public\common\oak\lib\%_TGTCPU%\%WINCEDEBUG%\ Basically, you just rebuild your new kernel, the rest is to "blddemo clean -q" to have your new kernel SYSGEN'd and include in your OS Image. Or just "set WINCEREL=1" then "sysgen -p common nk nkprof" and "makeimg" if you can't wait another minutes for "blddemo clean -q" Tat sounds good, but some of you may not like the idea to alter any code in private folder, and not to mention how annoying to backup/restore files every time. Better idea? Yes, Microsoft provides a tool SYSGEN_CAPTURE (http://msdn.microsoft.com/en-us/library/ee504678.aspx for detail and usage) to creates Sources files for public drivers that you want to modify and build in your platform directory. In fact, not only public drivers, virtually anything in the %_WINCEROOT%\public\<project name>\cesysgen\makefile can be captured, and of course including kernel. So I am going to introduce a second way to build your own kernel by using SYSGEN_CAPTURE tool. Again the steps Create a folder in your BSP for building kernel, says %_TARGETPLATROOT%\SRC\Kernel. Use "SYSGEN_CAPTURE -p common nk" and then you will get a SOURCES.KERN, you could also "SYSGEN_CAPTURE -p common nkprof" to generate profiler enabled kernel. rename the SOURCE.KERN to SOURCES and copy one of the sample makefile into your kernel directory. For example the one in PRIVATE\WINCEOS\COREOS\NK\KERNEL\NKNORMAL. Copy the source files you want to modify from private\winceos\coreos\nk\kernel\ into your kernel directory. Modifying the SOURCES= macro to the source files you addes in step 4. For example, if you copied the vm.c, it is going to be SOURCES=vm.c Refer to the private\winceos\COREOS\nk\kernel\sources.inc and add macro defines and proper include path in your SOURCES file. "set WINCEREL=1", "build -c" in your kernel directory and "makeimg", voila! Here is an example for the MACROS you need to add in x86 Here are the macros for x86 CDEFINES=$(CDEFINES) -DIN_KERNEL -DWINCEMACRO -DKERN_CORE # Machine independent defines CDEFINES=$(CDEFINES) -DDBGSUPPORT _COREOSROOT=$(_WINCEROOT)\private\winceos\coreos INCLUDES=$(_COREOSROOT)\inc;$(_COREOSROOT)\nk\inc !IFDEF DP_SETTINGS CDEFINES=$(CDEFINES) -DDP_SETTINGS=$(DP_SETTINGS) !ENDIF ASM_SAFESEH=1 CDEFINES=$(CDEFINES) -Gs100000 -DENCODE_GS_COOKIE

Read the article
Windows CE 5.0 image building: Possible without Platform Builder?

- by developer

Is it possible to create Windows CE 5.0 images (ie: nk.bin) from VS2005/VS2008 without using Platform Builder? If so, how? Can a vendor BSP for WinCE 5 be loaded into VS2005/2008? Are there the parts to do this available for download from Microsoft (ie: the SDK), or must you buy the special bits (a la PB) from a "special distributor"? I know it is possible to build binaries (.dll, .exe) for WinCE 5.0 using VS, my question is about creating entire bootable CE 5.0 images for embedded platforms.

Read the article
Generate a merge statement from table structure

- by Nigel Rivett

This code generates a merge statement joining on he natural key and checking all other columns to see if they have changed. The full version deals with type 2 processing and an audit trail but this version is useful. Just the insert or update part is handy too. Change the table at the top (spt_values in master in the version) and the join columns for the merge in @nk. The output generated is at the top and the code to run to generate it below. Output merge spt_values a using spt_values b on a.name = b.name and a.number = b.number and a.type = b.type when matched and (1=0 or (a.low b.low) or (a.low is null and b.low is not null) or (a.low is not null and b.low is null) or (a.high b.high) or (a.high is null and b.high is not null) or (a.high is not null and b.high is null) or (a.status b.status) or (a.status is null and b.status is not null) or (a.status is not null and b.status is null) ) then update set low = b.low , high = b.high , status = b.status when not matched by target then insert ( name , number , type , low , high , status ) values ( b.name , b.number , b.type , b.low , b.high , b.status ); Generator set nocount on declare @t varchar(128) = 'spt_values' declare @i int = 0 -- this is the natural key on the table used for the merge statement join declare @nk table (ColName varchar(128)) insert @nk select 'Number' insert @nk select 'Name' insert @nk select 'Type' declare @cols table (seq int, nkseq int, type int, colname varchar(128)) ;with cte as ( select ordinal_position, type = case when columnproperty(object_id(@t), COLUMN_NAME,'IsIdentity') = 1 then 3 when nk.ColName is not null then 1 else 0 end, COLUMN_NAME from information_schema.columns c left join @nk nk on c.column_name = nk.ColName where table_name = @t ) insert @cols (seq, nkseq, type, colname) select ordinal_position, row_number() over (partition by type order by ordinal_position) , type, COLUMN_NAME from cte declare @result table (i int, j int, k int, data varchar(500)) select @i = @i + 1 insert @result (i, data) select @i, 'merge ' + @t + ' a' select @i = @i + 1 insert @result (i, data) select @i, ' using cte b' select @i = @i + 1 insert @result (i, j, data) select @i, nkseq, ' ' + case when nkseq = 1 then 'on' else 'and' end + ' a.' + ColName + ' = b.' + ColName from @cols where type = 1 select @i = @i + 1 insert @result (i, data) select @i, ' when matched and (1=0' select @i = @i + 1 insert @result (i, j, k, data) select @i, seq, 1, ' or (a.' + ColName + ' b.' + ColName + ')' + ' or (a.' + ColName + ' is null and b.' + ColName + ' is not null)' + ' or (a.' + ColName + ' is not null and b.' + ColName + ' is null)' from @cols where type 1 select @i = @i + 1 insert @result (i, data) select @i, ' )' select @i = @i + 1 insert @result (i, data) select @i, ' then update set' select @i = @i + 1 insert @result (i, j, data) select @i, nkseq, ' ' + case when nkseq = 1 then ' ' else ', ' end + colname + ' = b.' + colname from @cols where type = 0 select @i = @i + 1 insert @result (i, data) select @i, ' when not matched by target then insert' select @i = @i + 1 insert @result (i, data) select @i, ' (' select @i = @i + 1 insert @result (i, j, data) select @i, seq, ' ' + case when seq = 1 then ' ' else ', ' end + colname from @cols where type 3 select @i = @i + 1 insert @result (i, data) select @i, ' )' select @i = @i + 1 insert @result (i, data) select @i, ' values' select @i = @i + 1 insert @result (i, data) select @i, ' (' select @i = @i + 1 insert @result (i, j, data) select @i, seq, ' ' + case when seq = 1 then ' ' else ', ' end + 'b.' + colname from @cols where type 3 select @i = @i + 1 insert @result (i, data) select @i, ' );' select data from @result order by i,j,k,data

Read the article
Encode H.264 stream in Internet Explorer

- by NK

I want to encode H.264 stream into IE. For that which file format/container i have to use? Please guide me. Thanks NK

Read the article
SSIS Lookup with Lookup Component Vs Script Component.

- by Nev_Rahd

Hello, I need to load Dimensions from EDW Tables (which does maintain historical records) and is of type Key-Value-Parameter. My scenario is ok if got a record in EDW as below Key1 Key2 Code Value EffectiveDate EndDate CurrentFlag 100 555 01 AAA 2010-01-01 11.00.00 9999-12-31 Y 100 555 02 BBB 2010-01-01 11.00.00 9999-12-31 Y This need to be loaded into DM by pivoting it as key1 and key2 combinations makes Natural key for DM SK NK 01 02 EffectiveDate EndDate CurrentFlag 1 100-555 AAA BBB 2010-01-01 11.00.00 9999-12-31 Y My ssis package does this all good pivoting... looking up the incoming NK in DIM.. if new will insert .. else with further lookup with effective date and determine if the incoming for same natural key got any new (change) in attribute.. if so updates the current record byy setting its end date and insert the new one with new attribute value and pulling the recent records values for other attributes. My problem is if the same natural key comes twice with same attribute in single extract my first lookup which on natural key .. will let both records pass and try to insert.. where its fails. If i get distinct records on NK the second is not picked and need to run package again. So my question how can i configure lookup or alernative way to handle this scenario when same NK comes twice in single extract, would be able to insert first record if not exists in Dim table and for second one should be able to updated with the changes with reference to one inserted above. Not sure this makes sense what am trying to explain. Will attached the screenshot once back to work desk (on monday). Thanks

Read the article
Log Debug Messages without Debug Serial on Shipped Device

- by Kate Moss' Open Space

Debug message is one of the ancient but useful way for problem resolving. Message is redirected to PB if KITL is enabled otherwise it goes to default debug port, usually a serial port on most of the platform but it really depends on how OEMWriteDebugString and OEMWriteDebugByte are implemented. For many reasons, we don't want to have a debug serial port, for example, we don't have enough spare serial ports and it can affect the performance. So some of the BSP designers decide to dump the messages into other media, could be a log file, shared memory or any solution that is suitable for the need. In CE 5.0 and previous, OAL and Kernel are linked into one binaries; in the other word, you can use whatever function in kernel, such as SC_CreateFileW to access filesystem in OAL, even this is strongly not recommended. But since the OAL is being a standalone executable in CE 6.0, we no longer can use this back door but only interface exported in NKGlobal which just provides enough for OAL but no more. Accessing filesystem or using sync object to communicate to other drivers or application is even not an option. Sounds like the kernel lock itself up; of course, OAL is in kernel space, you can still do whatever you want to hack into kernel, but once again, it is not only make it a dirty solution but also fragile. So isn't there an elegant solution? Let's see how a debug message print out. In private\winceos\COREOS\nk\kernel\printf.c, the OutputDebugStringW is the one for pumping out the messages; most of the code is for error handling and serialization but what really interesting is the following code piece if (g_cInterruptsOff) { OEMWriteDebugString ((unsigned short *)str); } else { g_pNKGlobal->pfnWriteDebugString ((unsigned short *)str); } CELOG_OutputDebugString(dwActvProcId, dwCurThId, str); It outputs the message to default debug output (is redirected to KITL when available) or OAL when needed but note that highlight part, it also invokes CELOG_OutputDebugString. Follow the thread to private\winceos\COREOS\nk\logger\CeLogInstrumentation.c, this function dump whatever input to CELOG. So whatever the debug message is we always got a clone in CELOG. General speaking, all of the debug message is logged to CELOG already, so what you need to do is using celogflush.exe with CELZONE_DEBUG zone, and then viewing the data using the by Readlog tool. Here are some information about these tools CELOG - http://msdn.microsoft.com/en-us/library/ee479818.aspx READLOG - http://msdn.microsoft.com/en-us/library/ee481220.aspx Also for advanced reader, I encourage you to dig into private\winceos\COREOS\nk\celog\celogdll, the source of CELOG.DLL and use it as a starting point to create a more lightweight debug message logger for your own device!

Read the article
convert bitset to string ??

- by mr.bio

Hi .. What is wrong with this code ? set<string> nk ; bitset<3> bs1(string("100")); nk.insert(bs1.to_string()); error: no matching function for call to `std::bitset<3u::to_string()' why?!

Read the article
Reading data from database and binding them to custom ListView

- by N.K.

I try to read data from a database i have made and to show some of the data in a row at a custom ListView. I can not understand what is my mistake. This is my code: public class EsodaMainActivity extends Activity { public static final String ROW_ID = "row_id"; //Intent extra key private ListView esodaListView; // the ListActivitys ListView private SimpleCursorAdapter esodaAdapter; // adapter for ListView DatabaseConnector databaseConnector = new DatabaseConnector(EsodaMainActivity.this); // called when the activity is first created @Override public void onCreate(Bundle savedInstanceState) { super.onCreate(savedInstanceState); setContentView(R.layout.activity_esoda_main); esodaListView = (ListView)findViewById(R.id.esodaList); esodaListView.setOnItemClickListener(viewEsodaListener); databaseConnector.open(); //Cursor cursor= databaseConnector.query("esoda", new String[] // {"name", "amount"}, null,null,null); Cursor cursor=databaseConnector.getAllEsoda(); startManagingCursor(cursor); // map each esoda to a TextView in the ListView layout // The desired columns to be bound String[] from = new String[] {"name","amount"}; // built an String array named "from" //The XML defined views which the data will be bound to int[] to = new int[] { R.id.esodaTextView, R.id.amountTextView}; // built an int array named "to" // EsodaMainActivity.this = The context in which the ListView is running // R.layout.esoda_list_item = Id of the layout that is used to display each item in ListView // null = // from = String array containing the column names to display // to = Int array containing the column names to display esodaAdapter = new SimpleCursorAdapter (this, R.layout.esoda_list_item, cursor, from, to); esodaListView.setAdapter(esodaAdapter); // set esodaView's adapter } // end of onCreate method @Override protected void onResume() { super.onResume(); // call super's onResume method // create new GetEsodaTask and execute it // GetEsodaTask is an AsyncTask object new GetEsodaTask().execute((Object[]) null); } // end of onResume method // onStop method is executed when the Activity is no longer visible to the user @Override protected void onStop() { Cursor cursor= esodaAdapter.getCursor(); // gets current cursor from esodaAdapter if (cursor != null) cursor.deactivate(); // deactivate cursor esodaAdapter.changeCursor(null); // adapter now has no cursor (removes the cursor from the CursorAdapter) super.onStop(); } // end of onStop method // this class performs db query outside the GUI private class GetEsodaTask extends AsyncTask<Object, Object, Cursor> { // we create a new DatabaseConnector obj // EsodaMainActivity.this = Context DatabaseConnector databaseConnector = new DatabaseConnector(EsodaMainActivity.this); // perform the db access @Override protected Cursor doInBackground(Object... params) { databaseConnector.open(); // get a cursor containing call esoda return databaseConnector.getAllEsoda(); // the cursor returned by getAllContacts() is passed to method onPostExecute() } // end of doInBackground method // here we use the cursor returned from the doInBackground() method @Override protected void onPostExecute(Cursor result) { esodaAdapter.changeCursor(result); // set the adapter's Cursor databaseConnector.close(); } // end of onPostExecute() method } // end of GetEsodaTask class // creates the Activity's menu from a menu resource XML file @Override public boolean onCreateOptionsMenu(Menu menu) { super.onCreateOptionsMenu(menu); MenuInflater inflater = getMenuInflater(); inflater.inflate(R.menu.esoda_menu, menu); // inflates(eµf?s?) esodamainactivity_menu.xml to the Options menu return true; } // end of onCreateOptionsMenu() method //handles choice from options menu - is executed when the user touches a MenuItem @Override public boolean onOptionsItemSelected(MenuItem item) { // creates a new Intent to launch the AddEditEsoda Activity // EsodaMainActivity.this = Context from which the Activity will be launched // AddEditEsoda.class = target Activity Intent addNewEsoda = new Intent(EsodaMainActivity.this, AddEditEsoda.class); startActivity(addNewEsoda); return super.onOptionsItemSelected(item); } // end of method onPtionsItemSelected() // event listener that responds to the user touching a esoda's name in the ListView OnItemClickListener viewEsodaListener = new OnItemClickListener() { @Override public void onItemClick(AdapterView<?> arg0, View arg1, int arg2, long arg3) { // create an intent to launch the ViewEsoda Activity Intent viewEsoda = new Intent(EsodaMainActivity.this, ViewEsoda.class); // pass the selected esoda's row ID as an extra with the Intent viewEsoda.putExtra(ROW_ID, arg3); startActivity(viewEsoda); // start viewEsoda.class Activity } // end of onItemClick() method }; // end of viewEsodaListener } // end of EsodaMainActivity class The statement: Cursor cursor=databaseConnector.getAllEsoda(); queries all data (columns) From the data I want to show at my custom ListView 2 of them: "name" and "amount". But I still get a debugger error. Please help.

Read the article
bitset to dynamic bitset

- by mr.bio

Hi.. I have a function where i use bitset.Now i need to convert it to a dynamic bitset.. but i don't know how. Can somebody help me ? set<string> generateCandidates(set<string> ck,unsigned int k){ set<string> nk ; for (set<string>::const_iterator p = ck.begin( );p != ck.end( ); ++p){ for (set<string>::const_iterator q = ck.begin( );q != ck.end( ); ++q){ bitset<4> bs1(*p); bitset<4> bs2(*q); bs1|= bs2 ; if(bs1.count() == k){ nk.insert(bs1.to_string<char,char_traits<char>,allocator<char> >()); } } } return nk; }

Read the article
Loading Dimension Tables - Methodologies

- by Nev_Rahd

Hello, Recently I been working on project, where need to populated Dim Tables from EDW Tables. EDW Tables are of type II which does maintain historical data. When comes to load Dim Table, for which source may be multiple EDW Tables or would be single table with multi level pivoting (on attributes). Mean: There would be 10 records - one for each attribute which need to be pivoted on domain_code to make a single row in Dim. Out of these 10 records there would be some attributes with same domain_code but with different sub_domain_code, which needs further pivoting on subdomain code. Ex: if i got domain code: 01,02, 03 = which are straight pivot on domain code I would also have domain code: 10 with subdomain code / version as 2006,2007,2008,2009 That means I need to split my source table with above attributes into two = one for domain code and other for domain_code + version. so far so good. When it comes to load Dim Table: As per design specs for Dimensions (originally written by third party), what they want is: for every single change in EDW (attribute), it should assemble all the related records (for that NK) mean new one with other attribute values which are current = process them to create a new dim record and insert it. That mean if a single extract contains 100 records updated (one for each NK), it should assemble 100 + (100*9) records to insert / update dim table. How good is this approach. Other way I tried to do is just do a lookup into dim table for that NK get the value's of recent records (attributes which not changed) and insert it and update the current one. What would be the better approach assembling records at source side for one attribute change or looking into dim table's recent record and process it. If this doesn't make sense, would like to elaborate it further. Thanks

Read the article
Windows CE Chat Transcript (March 30, 2010)

- by Bruce Eitman

For those of you who missed the chat today, here is the raw transcript. By raw, I mean that I copied and pasted the discussion without any edits. This is divided into two parts, the top part is the answers from the Microsoft Experts and the bottom part is the questions from the audience. Answers from Microsoft: Karel Danihelka [MS] (Expert)[2010-3-30 12:2]: Hi everyone, my name is Karel Danihelka and I am developer in partner response team. Sing Wee [MS] (Expert)[2010-3-30 12:2]: Hi, I'm Sing Wee, part of the CoreOS/BSP Test Team. GLanger_MS (Expert)[2010-3-30 12:2]: Hi, I'm Glen Langer, program manager on the Core Team. Karel Danihelka [MS] (Expert)[2010-3-30 12:3]: Q: I need to implement hardware timers on my windows CE 6.0 device to trigger events at microsecond intervals. Where should i start? A: Until you are using CPU with GHz frequency your only chance is use interrupt handler and implement all funcionality there. But it will be really tricky and may reduce system performance. If period will be near to millisecond timeframe you can use normal thread wait for event pattern. Karel Danihelka [MS] (Expert)[2010-3-30 12:5]: Q: I want to partition my NAND Flash device. One partition to use for hive ragistry and the other for the apps and data. The only way to do it is programmatically or setting some registry values ? A: It need to be set in registry - generally you need mark this partition as boot partition. Karel Danihelka [MS] (Expert)[2010-3-30 12:7]: Q: My CPU is Intel celeron M processor 1Ghz. A: In this case you can try use normal approach - in interrupt handler return SYSINTR and start thread in device driver which will spin thread waiting on event attached to this SYSINTR. Karel Danihelka [MS] (Expert)[2010-3-30 12:7]: Q: If i need to implement it using interrupt handlers, What are all the files that I should look at? A: Good quesiton - I would recommend documentation and there was BSP development book to download for free. mikehall_ms (Moderator)[2010-3-30 12:8]: Q: Hi guys, what's the formal way to report bugs back to the core team / product team? The mechanism of calling the support phone number every time is really onerous and time-consuming. Is there another mechanism? A: Using product support is the formal way to report bugs/issues - Product support can then create an issue that can be tracked. Karel Danihelka [MS] (Expert)[2010-3-30 12:9]: Q: But the operation for creating the partitions ? A: This is tricky - if you will make it autopartition & autoformat it will be created by filesystem. But generally it depends on your boot loader. mikehall_ms (Moderator)[2010-3-30 12:10]: Q: Is Windows Phone 7 related to Windows CE? If so, can you tell me what version of Windows CE is the basis? Is it in fact the new version of Windows Mobile? A: At MIX 2010 Charlie Kindel presented a session that described some of the core technologies that make up Windows Phone 7 Series, including the underlying operating system (Windows CE) and the new ISV programming model based on Silverlight and .NET - check out the Mix Online Videos to get more information. davbo-msft (Moderator)[2010-3-30 12:10]: Q: Is Windows Phone 7 related to Windows CE? If so, can you tell me what version of Windows CE is the basis? Is it in fact the new version of Windows Mobile? A: This forum is to discussed released products in the industry. Windows Mobile & Windows CE are based on the same Windows CE Kernel/system. Windows CE is focused on deliverying the OS for embedded customers in the market where Windows Mobile is focused on deliverying compelling Windows Phone platform. davbo-msft (Moderator)[2010-3-30 12:11]: Q: Is Windows Phone 7 related to Windows CE? If so, can you tell me what version of Windows CE is the basis? Is it in fact the new version of Windows Mobile? A: http://en.wikipedia.org/wiki/Windows_CE Wikipedia gives a good breakdown of the version history. Travis Hobrla [MS] (Expert)[2010-3-30 12:13]: Q: I created a OS design with KITL and kernel debugger enabled. But I am unable to connect to the target for debugging. I am getting the following error when i try to connect with the device. PB Debugger Cannot initialize the Kernel Debugger. PB Debugger Debugger could not initialize connection. PB Debugger The Kernel Debugger is waiting to connect with target. PB Debugger The Kernel Debugger has been disconnected successfully. A: One possibility is that a rogue cesvchost.exe has co-opted the debugger. I am assuming this is CE 6.0? Can you try exiting visual studio and manually killing the cesvchost.exe process from the Task Manager? davbo-msft (Moderator)[2010-3-30 12:14]: Q: Hi guys, what's the formal way to report bugs back to the core team / product team? The mechanism of calling the support phone number every time is really onerous and time-consuming. Is there another mechanism? A: For info on contacting Microsoft support refer to the support page on the Embedded website: http://msdn.microsoft.com/en-us/windowsembedded/dd897633.aspx Sing Wee [MS] (Expert)[2010-3-30 12:16]: Q: Do u mean ISR/IST implementation? How can i register an interrupt? What kind of interrupt should i register? A: A good introduction to interrupts in WinCE 6.0 can be found here (aside from the documentation on MSDN): http://download.microsoft.com/download/9/c/f/9cffaa58-4000-48d6-a4b2-5fed9e4e6410/Chapter%206%20-%20Developing%20Device%20Drivers.pdf mikehall_ms (Moderator)[2010-3-30 12:16]: Q: What will be different in Windows Compact 7 from CE 6.0? A: Unfortunately we cannot discuss unreleased products on this chat - keep an eye on the Windows Embedded web site and blogs to keep up to date with product announcements. Travis Hobrla [MS] (Expert)[2010-3-30 12:16]: Q: I am using CE 6.0. There is no cesvchost process running in my system. A: What operating system are you using? Karel Danihelka [MS] (Expert)[2010-3-30 12:16]: Q: So...I have to modify file system code to create 2 partition at system startup ?!! I haven't understood.... A: You don't need to modify code, there are registry settings to achive this (look to documentation). But you may need to create partition table in boot loader. Unfortunatelly there isn't simple way how to do it. davbo-msft (Moderator)[2010-3-30 12:18]: Q: I would like to get a handle to a Silverlight screen section, is that possiable? A: Windows Embedded CE 6.0 R3 includes Sliverlight for Windows Embedded. Refer to New Features overview on the embedded web site. http://www.microsoft.com/windowsembedded/en-us/products/windowsce/default.mspx. Silverlight - The power of Silverlight brought to Windows Embedded CE to create rich applications and user interfaces is new part of Windows CE Embedded. mikehall_ms (Moderator)[2010-3-30 12:20]: Q: The link for developing device drivers is not working. can u please check that? A: http://msdn.microsoft.com/en-us/library/ms923714.aspx davbo-msft (Moderator)[2010-3-30 12:20]: Q: I would like to get a handle to a Silverlight screen section, is that possiable? A: Sorry misunderstood the question I thought you were asking if embedded CE could handle Silverlight. Please repost so that the question goes back into the active queue because once answered no way to put the status back to open. Travis Hobrla [MS] (Expert)[2010-3-30 12:21]: Q: sorry! Windows XP SP3 A: Can you try exiting VS2005 and confirming cesvchost.exe is not running, then renaming C:\Documents and Settings\USERNAME\Local Settings\Application Data\Microsoft\CoreCon\1.0 to 1.0_backup, then restarting VS2005? Sing Wee [MS] (Expert)[2010-3-30 12:24]: Q: Can I have the book's name please? A: I believe the downloadable version is related to the last link I sent. If you go to the following website, I believe you can download the whole thing: http://msdn.microsoft.com/en-us/windowsembedded/ce/cc294468.aspx davbo-msft (Moderator)[2010-3-30 12:24]: Q: If one has an image on a Silverlight page, it seems to be cached. How would one refresh that cache after changing the underlying image? A: change the URI of the image or use a writeable bitmap if they want to manually toggle the pixels Sing Wee [MS] (Expert)[2010-3-30 12:25]: A: Whoops, hit [ENTER] too early. On the right side, you'll see there an "Exam Preparation Kit" link that can be downloaded in several different languages. Sing Wee [MS] (Expert)[2010-3-30 12:25]: Q: Can I have the book's name please? A: Whoops, hit [ENTER] too early. On the right side, you'll see there an "Exam Preparation Kit" link that can be downloaded in several different languages. Karel Danihelka [MS] (Expert)[2010-3-30 12:26]: Q: I have a NAND Flash on my target device. On this flash I have the hive registry and an application.I have observed that when the NAND flash is fully, the system startup time is longer....is there a degradation of NAND use that influences the startup time ? Why ? A: Yes - on boot flash abstraction library (old one) read metadata from all sectors to rebuild physical - logical mapping table. davbo-msft (Moderator)[2010-3-30 12:27]: Q: I would like to get a handle to a Silverlight screen section, is that possiable? A: Need addition info on this question. Can you provide more details on what you are trying to do in Silverlight? davbo-msft (Moderator)[2010-3-30 12:27]: Q: If one has an image on a Silverlight page, it seems to be cached. How would one refresh that cache after changing the underlying image? A: Additonal Info: if you want to manually touch the pixels use WriteableBitmap if you want to use the underlying HWND then use IXRVisualHost::GetHWND() davbo-msft (Moderator)[2010-3-30 12:28]: Q: Writable bitmap, is there an example of the syntax? A: if you want to manually touch the pixels use WriteableBitmap if you want to use the underlying HWND then use IXRVisualHost::GetHWND() davbo-msft (Moderator)[2010-3-30 12:29]: Q: I would like to get a handle to a Silverlight screen section, is that possiable? A: Can I get more information about this question about what you are trying to accomplish in Silverlight? davbo-msft (Moderator)[2010-3-30 12:31]: Q: IXRVisualHost::GetHWND() exactly what I needed Thanks, A: Your welcome Sing Wee [MS] (Expert)[2010-3-30 12:31]: Q: ok. thanks for the book's link A: No problem. Travis Hobrla [MS] (Expert)[2010-3-30 12:32]: Q: Typically for SoC devices you name your hardware specific libraries in the form "SOCDIRNAME_LIBNAME". In our platform "OMAP35XX_TPS659XX_TI_V1" if you do this we cause the catalog parser to die... For example if we have a library "Musbfn_OMAP35XX_TPS659XX_TI_V1.dll" entering this in the catalogs pbcxml file in a <module> section causes the XML parser to fail with : Error 3 The 'urn:Microsoft.PlatformBuilder/Catalog:Module' element is invalid - The value '012345678901234567890123456789.dll' is invalid according to its datatype 'urn:Microsoft.PlatformBuilder/Catalog:CatalogFileName' - The actual length is greater than the MaxLength value. A: There are a couple workarounds I can think of. I believe the Module element is only used when doing SYSGEN parsing to make sure dependent SYSGENs are present when the item is selected, so I believe it is optional to the catalog. The other obvious workaround is to shorten the soc name. I realize neither of these solutions is ideal. This is not something we anticipated when we tested CE6.0, sorry. Travis Hobrla [MS] (Expert)[2010-3-30 12:33]: Q: I am getting this error only when I select the KdStub as the debugger in Target device connectivity. A: Right, but KdStub is the debugger that you should use. Have you tried the steps I suggested? Travis Hobrla [MS] (Expert)[2010-3-30 12:36]: Q: If I select Active KTIL, My OS doesn't boots. It says "loading NK.EXE at 0x<xxxxx> location" after that nothing comes in the debug log. A: Can you look at the serial debug output and see what is happening there? Often it can give you a clue to the KITL driver malfunctioning. Travis Hobrla [MS] (Expert)[2010-3-30 12:38]: Q: I have tried that and I am getting the same error. A: I am assuming you have a device created in Target -> Connectivity Options in Platform Builder. What are the Kernel download / Kernel transport for your device? Travis Hobrla [MS] (Expert)[2010-3-30 12:40]: Q: KITL: *** Device Name CEPC56059 *** WARN: KITL will run in polling mode VBridge:: built on [Jul 10 2009] time [10:20:14] VBridgeInit()...TX = [16384] bytes -- Rx = [16384] bytes Tx buffer [0xA1B84860] to [0xA1B88860]. Rx buffer [0xA1B88880] to [0xA1B8C880]. VBridge:: NK add MAC: [0-60-65-2-DA-FB] Connecting to Desktop KITL: Connected host IP: 1 Port: 1086 .. this is the output of the serial debug A: This looks reasonable and does not give clues as to why boot would halt at that point. If you capture a network trace or turn on KITL debug zones via dpCurSettings in kitl.dll, do you see KITL active after this? Travis Hobrla [MS] (Expert)[2010-3-30 12:41]: Q: Both is happening via Ethernet. A: Only thing I have left to suggest is a Platform Builder installation Repair, then. Karel Danihelka [MS] (Expert)[2010-3-30 12:42]: Q: Hi, I saw that the ATADISK is quite generic and des not have any optimizations. Do you have any advice to consider while tryin to improve the performance of it? A: If I remember correctly sample code has support for some specific hardware controllers (little obsolete now). This should be good start point (if you will not decide take existing driver as sample and write you own). Travis Hobrla [MS] (Expert)[2010-3-30 12:44]: Q: I didn't do that. I have to try. A: I think that's the next valid step. You need to figure out whether KITL is hanging or the device - use instrumented serial debug messages and network trace to determine this. Sing Wee [MS] (Expert)[2010-3-30 12:46]: Q: KITL: *** Device Name CEPC56059 *** WARN: KITL will run in polling mode VBridge:: built on [Jul 10 2009] time [10:20:14] VBridgeInit()...TX = [16384] bytes -- Rx = [16384] bytes Tx buffer [0xA1B84860] to [0xA1B88860]. Rx buffer [0xA1B88880] to [0xA1B8C880]. VBridge:: NK add MAC: [0-60-65-2-DA-FB] Connecting to Desktop KITL: Connected host IP: 1 Port: 1086 .. this is the output of the serial debug A: Neo, have you by any chance tried looking into your firewall to see if it might be blocking traffic on any particular ports? Wireshark/netmon might be able to help you here if that's the issue. davbo-msft (Moderator)[2010-3-30 12:48]: Q: I lost spell check, how can i get it back A: Hello - can you give additional details about your question? Is this related to a Windows CE Embedded application? masatos_MSFT (Expert)[2010-3-30 12:51]: Q: When attempting to run the CETK cellcore tests the documentation states the pre-requisites include "stinger.ini", "ltk.ini" but windows CE doesn't provide these or document what they fully need to contain. Implicitly you also need "datatrans.xml" which isn't supplied. If you get around this error and steal these from Windows Mobile instead, when you try and run the CETK tests you get a data abort in radiometricsdll.dll. How should we invoke the cellcore parts of CETK? A: Hi Pev, what version of Windows CE and CETK are you using? I do not have the expertise to answer this question, but can find somebody who can. Travis Hobrla [MS] (Expert)[2010-3-30 12:52]: Q: I don't see a kitlcore.dll in my OS. is my debug image fails to load because of that? A: kitl.dll should be all that's needed, kitlcore.lib is linked into that. Travis Hobrla [MS] (Expert)[2010-3-30 12:55]: Q: I've got a platform (not developed by myself) where I2C bus support has to be provided through the OAL as the kernel needs to talk to devices such as the power management IC and gas gauge so a 'proper' I2C driver hanging off device manager isn't possible. This happens to be a polled driver, so obviously it hits the system hard when either under lots of traffic or an error condition occurs and the driver constantly polls. I originally thought that there was no straightforward way to make such code interrupt driven in the kernel (as it's a cludge) but I realised that that's exactly what ETHDBG drivers do. Is there any reason why I shouldn't have a go at implementing a similar mechanism for our kernel resident I2C driver? If not, are there any obvious pitfalls - I've not seen any other BSP's do this in the past... A: You can make a 'proper' driver that calls down into the OAL to do the actual I2C transactions. Alternatively you can build an interrupt-based version in the OAL where you handle everything in the ISR. There is nothing wrong with that so long as the rest of your drivers and app threads can handle longer times with interrupts off while you are servicing I2C interrupts. Sing Wee [MS] (Expert)[2010-3-30 12:55]: Q: I am having trouble with my mouse, I have the microsoft wireless mobile mouse 3000, when I push the scroll button I am suppose to have autoscroll instead it shows other web pages,Can you help me out tell me what to do!!! A: Sorry, this current chat is about Windows Embedded Compact. Hope you're able to find an answer to your question elsewhere. davbo-msft (Moderator)[2010-3-30 12:56]: Q: Is the Silverlight Animation "Spline" a BezierSpline? A: Spline - http://msdn.microsoft.com/en-us/library/ee501495.aspx<BR< a>> masatos_MSFT (Expert)[2010-3-30 12:57]: Thanks for the info Pev. I will follow up with the CETK experts here and get back to you. davbo-msft (Moderator)[2010-3-30 12:59]: Q: Spline- bad link A: http://msdn.microsoft.com/en-us/library/ee501495.aspx davbo-msft (Moderator)[2010-3-30 13:0]: Q: Sorry, got to tirm the ">" A: No Worries http://msdn.microsoft.com/en-us/library/ee501495.aspx davbo-msft (Moderator)[2010-3-30 13:0]: Hello everyone, we are just about out of time. Thank you for joining us for our Windows Embedded CE 6.0 chat today! <http://www.Microsoft.com/Embedded>; A special thank you to the product group members for coming out. The transcript of today’s chat will be posted online as soon as possible, to <http://msdn.microsoft.com/en-us/chats>;. We’ll see you again for another chat next month. Please check <http://msdn.microsoft.com/en-us/chats>; for the list of upcoming chats. If you still have unanswered questions, let me suggest that you post them on one of our newsgroups on <http://msdn.microsoft.com/en-us/windowsembedded/ce/default.aspx> -Windows Embedded CE 6.0 R3 Now Available! <http://msdn.microsoft.com/windowsembedded/ce/dd630616.aspx>; davbo-msft (Moderator)[2010-3-30 13:1]: Q: hi everybody. I would like to know if there is something know about a bug in RTC API (VOIP), especially when using SIP. According the to the analysis with application verifyier there is a heap link in rtcdllmedia.dll. All of the unreleased chunks seem to have a size of 6560 bytes. A: I will follow up with the Networking Team for a response. davbo-msft (Moderator)[2010-3-30 13:1]: Q: Hi, we've problems with debugging of applications (= breakpoints in Platform Builder will be ignored) over KITL on Windows CE 5.0, if the PDB files are large (over 60MB). Are there any limitations to size of the PDB files? A: I will follow up with the tools team for a response and post with the transcript. Sing Wee [MS] (Expert)[2010-3-30 13:1]: Q: I am unable to use the target control in my development environment. any ideas? A: Make sure you have SYSGEN_SHELL=1 set in your build environment. davbo-msft (Moderator)[2010-3-30 13:3]: Q: what are the main differences between Object Store and RAM disk ? They are both in RAM...are there performance differences ? access differences ? A: I will follow up with the Core Team and get a response posted with the transcript to MSDN The Questions [2010-3-30 12:57]: Thanks for the info Pev. I will follow up with the CETK experts here and get back to you. [2010-3-30 12:59]: [2010-3-30 13:0]: [2010-3-30 13:0]: Hello everyone, we are just about out of time. Thank you for joining us for our Windows Embedded CE 6.0 chat today! <http://www.Microsoft.com/Embedded>; A special thank you to the product group members for coming out. The transcript of today’s chat will be posted online as soon as possible, to <http://msdn.microsoft.com/en-us/chats>;. We’ll see you again for another chat next month. Please check <http://msdn.microsoft.com/en-us/chats>; for the list of upcoming chats. If you still have unanswered questions, let me suggest that you post them on one of our newsgroups on <http://msdn.microsoft.com/en-us/windowsembedded/ce/default.aspx> -Windows Embedded CE 6.0 R3 Now Available! <http://msdn.microsoft.com/windowsembedded/ce/dd630616.aspx>; [2010-3-30 13:1]: [2010-3-30 13:1]: [2010-3-30 13:1]: [2010-3-30 13:3]: neo (Guest)[2010-3-30 11:37]: Hi all KellyG (Guest)[2010-3-30 11:37]: Hi KellyG (Guest)[2010-3-30 11:37]: I have a question unrelated to windows Ce embedded, can you please help me?? neo (Guest)[2010-3-30 11:38]: I need to implement hardware timers on my windows CE 6.0 device to trigger events at microsecond intervals! c neo (Guest)[2010-3-30 11:38]: yes. post it. May be i cud give a try KellyG (Guest)[2010-3-30 11:38]: My Product key listed on my tower is not the product key I need for microsoft office, but that is the only product key listed. neo (Guest)[2010-3-30 11:39]: I hope this is a chat for windows embedded. please post ur queries in office forums KellyG (Guest)[2010-3-30 11:39]: it is but i could not find a forum for office neo (Guest)[2010-3-30 11:40]: I think moderators will help u out. @ davbo-msft: can u help this guy? neo (Guest)[2010-3-30 11:41]: Q: I need to implement hardware timers on my windows CE 6.0 device to trigger events at microsecond intervals. Where should i start? davbo-msft (Moderator)[2010-3-30 11:50]: Our chat today covers the topic of Windows Embedded CE! 1. This chat will last for one hour. During this hour, our Experts will respond to as many questions as they can. Please understand that there may be some questions we cannot respond to due to lack of information or because the information is not yet public. 2. We encourage you to submit questions for our Experts. To do so, type your questions in the send box, select the “ask the Experts” box and click SEND. Questions sent directly to the Guest Chat room will not be answered by the Experts, but we encourage other community members to assist. 3. We ask that you stay on topic for the duration of the chat. This helps the Guests and Experts follow the conversation more easily. We invite you to ask off topic questions after this chat is over, but not during. 4. Please abide by the Chat Code of Conduct. Chat code of conduct: <http://msdn.microsoft.com/chats/chatroom.aspx?ctl=hlp#Conduct>; Pev (Guest)[2010-3-30 11:54]: Evening! davbo-msft (Moderator)[2010-3-30 11:54]: Hello everyone this is Dave Boyce - I worked in the Multimedia area for Windows CE. neo (Guest)[2010-3-30 11:55]: hello dave neo (Guest)[2010-3-30 11:55]: The chat code of conduct link is not working! Pev (Guest)[2010-3-30 11:56]: Best be polite just in case then ;-) neo (Guest)[2010-3-30 11:56]: davbo-msft (Moderator)[2010-3-30 11:57]: I'll check out the issue w/ the link paolopat (Guest)[2010-3-30 12:0]: Hello davbo-msft (Moderator)[2010-3-30 12:0]: We are pleased to welcome our Experts for today’s chat. I will have them introduce themselves now. Chat will begin in a couple of minutes. <http://www.Microsoft.com/Embedded>; paolopat (Guest)[2010-3-30 12:3]: Hello Experts ! neo (Guest)[2010-3-30 12:3]: Welcome all! paolopat (Guest)[2010-3-30 12:3]: Q: I want to partition my NAND Flash device. One partition to use for hive ragistry and the other for the apps and data. The only way to do it is programmatically or setting some registry values ? neo (Guest)[2010-3-30 12:3]: Q: I need to implement hardware timers on my windows CE 6.0 device to trigger events at microsecond intervals. Where should i start? neo (Guest)[2010-3-30 12:5]: Q: My CPU is Intel celeron M processor 1Ghz. Pev (Guest)[2010-3-30 12:5]: neo: if your silicon has multiple general purpose timers, pick one that's not in use for the system timer / profiler and set it up to trigger irqs for your purpose. You can't guarantee hard realtime type responses though... GarySwalling (Guest)[2010-3-30 12:5]: Q: Is Windows Phone 7 related to Windows CE? If so, can you tell me what version of Windows CE is the basis? Is it in fact the new version of Windows Mobile? Pev (Guest)[2010-3-30 12:6]: Q: Hi guys, what's the formal way to report bugs back to the core team / product team? The mechanism of calling the support phone number every time is really onerous and time-consuming. Is there another mechanism? paolopat (Guest)[2010-3-30 12:6]: Q: But the operation for creating the partitions ? neo (Guest)[2010-3-30 12:6]: Q: If i need to implement it using interrupt handlers, What are all the files that I should look at? GPM (Guest)[2010-3-30 12:6]: Q: I would like to get a handle to a Silverlight screen section, is that possiable? Jhony (Guest)[2010-3-30 12:7]: Q: I created a OS design with KITL and kernel debugger enabled. But I am unable to connect to the target for debugging. I am getting the following error when i try to connect with the device. PB Debugger Cannot initialize the Kernel Debugger. PB Debugger Debugger could not initialize connection. PB Debugger The Kernel Debugger is waiting to connect with target. PB Debugger The Kernel Debugger has been disconnected successfully. Charles (Guest)[2010-3-30 12:7]: What will be different in Windows Compact 7 from CE 6.0? neo (Guest)[2010-3-30 12:8]: Can I have the book's name please? kiefs_dev (Guest)[2010-3-30 12:8]: Q: hi everybody. I would like to know if there is something know about a bug in RTC API (VOIP), especially when using SIP. According the to the analysis with application verifyier there is a heap link in rtcdllmedia.dll. All of the unreleased chunks seem to have a size of 6560 bytes. paolopat (Guest)[2010-3-30 12:10]: Q: So...I have to modify file system code to create 2 partition at system startup ?!! I haven't understood.... neo (Guest)[2010-3-30 12:10]: Q: Do u mean ISR/IST implementation? How can i register an interrupt? What kind of interrupt should i register? neo (Guest)[2010-3-30 12:11]: Q: Can I have the book's name please? Charles (Guest)[2010-3-30 12:11]: Q: What will be different in Windows Compact 7 from CE 6.0? PaulT (Guest)[2010-3-30 12:11]: neo: I'd say that you really need the docs for YOUR BSP, not generic documents for BSPs in general. Each BSP may be architected differently. If you're using the CEPC BSP, then the documentation that comes with Platform Builder is a reasonable place to look. GPM (Guest)[2010-3-30 12:11]: Q: If one has an image on a Silverlight page, it seems to be cached. How would one refresh that cache after changing the underlying image? Elektrobit (Guest)[2010-3-30 12:12]: Q: Hi, we've problems with debugging of applications (= breakpoints in Platform Builder will be ignored) over KITL on Windows CE 5.0, if the PDB files are large (over 60MB). Are there any limitations to size of the PDB files? Jhony (Guest)[2010-3-30 12:15]: Q: I am using CE 6.0. There is no cesvchost process running in my system. alexquisi (Guest)[2010-3-30 12:15]: Q: Hi, I saw that the ATADISK is quite generic and des not have any optimizations. Do you have any advice to consider while tryin to improve the performance of it? Jhony (Guest)[2010-3-30 12:17]: Windows XP service pack 1 Jhony (Guest)[2010-3-30 12:17]: Q: sorry! Windows XP SP3 neo (Guest)[2010-3-30 12:19]: Q: The link for developing device drivers is not working. can u please check that? paolopat (Guest)[2010-3-30 12:20]: Q: I have a NAND Flash on my target device. On this flash I have the hive registry and an application.I have observed that when the NAND flash is fully, the system startup time is longer....is there a degradation of NAND use that influences the startup time ? Why ? Pev (Guest)[2010-3-30 12:20]: Q: When attempting to run the CETK cellcore tests the documentation states the pre-requisites include "stinger.ini", "ltk.ini" but windows CE doesn't provide these or document what they fully need to contain. Implicitly you also need "datatrans.xml" which isn't supplied. If you get around this error and steal these from Windows Mobile instead, when you try and run the CETK tests you get a data abort in radiometricsdll.dll. How should we invoke the cellcore parts of CETK? Pev (Guest)[2010-3-30 12:21]: Hi all, Pev (Guest)[2010-3-30 12:21]: oops Pev (Guest)[2010-3-30 12:21]: :-D Pev (Guest)[2010-3-30 12:24]: Typically for SoC devices you name your hardware specific libraries in the form "SOCDIRNAME_LIBNAME". In our platform "OMAP35XX_TPS659XX_TI_V1" if you do this we cause the catalog parser to die... For example if we have a library "Musbfn_OMAP35XX_TPS659XX_TI_V1.dll" entering this in the catalogs pbcxml file in a <module> section causes the XML parser to fail with : Pev (Guest)[2010-3-30 12:25]: Q: Error 3 The 'urn:Microsoft.PlatformBuilder/Catalog:Module' element is invalid - The value '012345678901234567890123456789.dll' is invalid according to its datatype 'urn:Microsoft.PlatformBuilder/Catalog:CatalogFileName' - The actual length is greater than the MaxLength value. GPM (Guest)[2010-3-30 12:25]: Q: Writable bitmap, is there an example of the syntax? Pev (Guest)[2010-3-30 12:25]: Q: Typically for SoC devices you name your hardware specific libraries in the form "SOCDIRNAME_LIBNAME". In our platform "OMAP35XX_TPS659XX_TI_V1" if you do this we cause the catalog parser to die... For example if we have a library "Musbfn_OMAP35XX_TPS659XX_TI_V1.dll" entering this in the catalogs pbcxml file in a <module> section causes the XML parser to fail with : Error 3 The 'urn:Microsoft.PlatformBuilder/Catalog:Module' element is invalid - The value '012345678901234567890123456789.dll' is invalid according to its datatype 'urn:Microsoft.PlatformBuilder/Catalog:CatalogFileName' - The actual length is greater than the MaxLength value. Pev (Guest)[2010-3-30 12:25]: sorry, messed up submission there! GPM (Guest)[2010-3-30 12:26]: Q: I would like to get a handle to a Silverlight screen section, is that possiable? GPM (Guest)[2010-3-30 12:28]: Q: IXRVisualHost::GetHWND() exactly what I needed Thanks, PaulT (Guest)[2010-3-30 12:29]: GPM: You don't have to keep submitting the questions. The chat experts have an application that they're using to follow the chat and all Ask the Experts questions are logged. Jhony (Guest)[2010-3-30 12:29]: Q: I am getting this error only when I select the KdStub as the debugger in Target device connectivity. neo (Guest)[2010-3-30 12:30]: Q: ok. thanks for the book's link Pev (Guest)[2010-3-30 12:31]: Hm, did those two I submitted get picked up by anyone? neo (Guest)[2010-3-30 12:33]: Q: If I select Active KTIL, My OS doesn't boots. It says "loading NK.EXE at 0x<xxxxx> location" after that nothing comes in the debug log. PaulT (Guest)[2010-3-30 12:34]: Pev: PaulT (Guest)[2010-3-30 12:35]: Pev: I'm sure they did. The guys who are actually on the chat may not be experts in that part of things. That's usually the explanation when you don't get an answer in 10 minutes or so. Pev (Guest)[2010-3-30 12:36]: Ah, fair enough Susie (Guest)[2010-3-30 12:36]: My Outlook Express incoming mail is corrput. No ONE has been able to fix the problem, Dell or Norton. I have dial up I'm in a rural area Jhony (Guest)[2010-3-30 12:37]: Q: I have tried that and I am getting the same error. Pev (Guest)[2010-3-30 12:37]: Susie : Use Thunderbird instead :-D PaulT (Guest)[2010-3-30 12:37]: Susie: Sorry, but this chat is not about Windows, but Embedded (like what runs on a phone). Your best chance is to find a local expert or talk to your ISP. paolopat (Guest)[2010-3-30 12:37]: Q: what are the main differences between Object Store and RAM disk ? They are both in RAM...are there performance differences ? access differences ? Susie (Guest)[2010-3-30 12:38]: My computer knowlege is very limited, what is Thunderbird? Pev (Guest)[2010-3-30 12:38]: A different email client :-D neo (Guest)[2010-3-30 12:39]: Q: KITL: *** Device Name CEPC56059 *** WARN: KITL will run in polling mode VBridge:: built on [Jul 10 2009] time [10:20:14] VBridgeInit()...TX = [16384] bytes -- Rx = [16384] bytes Tx buffer [0xA1B84860] to [0xA1B88860]. Rx buffer [0xA1B88880] to [0xA1B8C880]. VBridge:: NK add MAC: [0-60-65-2-DA-FB] Connecting to Desktop KITL: Connected host IP: 1 Port: 1086 .. this is the output of the serial debug Susie (Guest)[2010-3-30 12:39]: Do I need to uninstall Outlook Express youngboyzie (Guest)[2010-3-30 12:39]: I need to start battery calibration for my new battery for my dell inspiron 1525 laptop and should be able to reach the BIOS screen by hitting f2 but this isnt working... help? Pev (Guest)[2010-3-30 12:39]: Nah, you can run it instead - you'll still need help from your ISP to configure it I expect Jhony (Guest)[2010-3-30 12:39]: Q: Both is happening via Ethernet. PaulT (Guest)[2010-3-30 12:40]: youngboyzie: You're off-topic. This is not a general chat for Windows and certainly not for Dell. You'll have to ask Dell how to get to setup; it's their machine. bill (Guest)[2010-3-30 12:41]: I lost spell check, how can i get et back neo (Guest)[2010-3-30 12:42]: Q: I didn't do that. I have to try. Jhony (Guest)[2010-3-30 12:43]: Q: Ok. I will do it then. bill (Guest)[2010-3-30 12:43]: Q: I lost spell check, how can i get it back PaulT (Guest)[2010-3-30 12:44]: bill: This isn't a general Windows chat. There are some Web forums that you might try. GarySwalling (Guest)[2010-3-30 12:45]: Q: Thanks, I found the Phone 7 presentation at http://live.visitmix.com/MIX10/Sessions/CL13 GPM (Guest)[2010-3-30 12:45]: Q: Is the Silverlight Animation "Spline" a BezierSpline? neo (Guest)[2010-3-30 12:46]: Q: ok. I'll do it. thanks Pev (Guest)[2010-3-30 12:46]: Whowever was asking about KITL connection : I've had this loads in the past. I think I started debugging last time by using wireshark to see what was happening on the network then setting up the OAL_ETHER and OAL_FUNC and OAL_VERBOSE as well as OAL_KITL flags to see what was actually happening in the driver.... Pev (Guest)[2010-3-30 12:47]: I'd generally make sure that you're testing though a 10baseT hub (instead of anything faster) and forcing Active KITL in polled mode too... neo (Guest)[2010-3-30 12:48]: Q: I disabled the firewall in my PC. Pev (Guest)[2010-3-30 12:50]: Q: I've got a platform (not developed by myself) where I2C bus support has to be provided through the OAL as the kernel needs to talk to devices such as the power management IC and gas gauge so a 'proper' I2C driver hanging off device manager isn't possible. This happens to be a polled driver, so obviously it hits the system hard when either under lots of traffic or an error condition occurs and the driver constantly polls. I originally thought that there was no straightforward way to make such code interrupt driven in the kernel (as it's a cludge) but I realised that that's exactly what ETHDBG drivers do. Is there any reason why I shouldn't have a go at implementing a similar mechanism for our kernel resident I2C driver? If not, are there any obvious pitfalls - I've not seen any other BSP's do this in the past... neo (Guest)[2010-3-30 12:51]: Q: I don't see a kitlcore.dll in my OS. is my debug image fails to load because of that? Pev (Guest)[2010-3-30 12:52]: Q: Hi masatos, I'm using Windows Embedded CE 6.0 with R3 and patched to feb 2010's QFE's (with it's associated CETK version) this is a machine with only CE 6.0 on (no conflicts with earlier CE or WM...) neo (Guest)[2010-3-30 12:54]: ok. Got it neo (Guest)[2010-3-30 12:54]: Q: ok. Got it Roundman (Guest)[2010-3-30 12:55]: Q: I am having trouble with my mouse, I have the microsoft wireless mobile mouse 3000, when I push the scroll button I am suppose to have autoscroll instead it shows other web pages,Can you help me out tell me what to do!!! Pev (Guest)[2010-3-30 12:55]: Hey neo, debugging kitl issues is really frustrating but dont lose heart :-) neo (Guest)[2010-3-30 12:55]: @ pev : u fixed the problem of KITL after that? neo (Guest)[2010-3-30 12:56]: I am getting the same error again and again. I even cleaned my environment and tried in a fresh PC. But didn't succeed yet Pev (Guest)[2010-3-30 12:56]: Well, eventually - my experiences probably won't help you as different platforms have different reasons for doing that neo (Guest)[2010-3-30 12:56]: I think so PaulT (Guest)[2010-3-30 12:58]: neo: Have you searched the old messages in microsoft.public.windowsce.platbuilder? It seems to me that there was a packet size situation where it was possible to have problems with KITL connections based on a setting on the PC. Google Groups, groups.google.com, Advanced Groups Search will allow you to search a single newsgroup or a set of newsgroups easily. GPM (Guest)[2010-3-30 12:58]: Q: Spline- bad link PaulT (Guest)[2010-3-30 12:58]: GPM: without the > at the end does it work? It seems to for me... neo (Guest)[2010-3-30 12:58]: But some times if i try to connect to the device again. The Image information is seen in the serial debug. what does that mean?Download BIN file information: ----------------------------------------------------- [0]: Base Address=0x220000 Length=0x18DAADC Received a broadcast message !CheckUDP: Not UDP (proto = 0x00000001) after this i am getting the old errors. PB debugger cannot initialize ... GPM (Guest)[2010-3-30 12:59]: Q: Sorry, got to tirm the ">" neo (Guest)[2010-3-30 12:59]: ok paul. I ll look into that. davbo-msft (Moderator)[2010-3-30 13:0]: Hello everyone, we are just about out of time. Thank you for joining us for our Windows Embedded CE 6.0 chat today! <http://www.Microsoft.com/Embedded>; A special thank you to the product group members for coming out. The transcript of today’s chat will be posted online as soon as possible, to <http://msdn.microsoft.com/en-us/chats>;. We’ll see you again for another chat next month. Please check <http://msdn.microsoft.com/en-us/chats>; for the list of upcoming chats. If you still have unanswered questions, let me suggest that you post them on one of our newsgroups on <http://msdn.microsoft.com/en-us/windowsembedded/ce/default.aspx> -Windows Embedded CE 6.0 R3 Now Available! <http://msdn.microsoft.com/windowsembedded/ce/dd630616.aspx>; neo (Guest)[2010-3-30 13:1]: Q: I am unable to use the target control in my development environment. any ideas? Pev (Guest)[2010-3-30 13:1]: Sure, if KITL isn't connected target control won't work as it runs over kitl... neo (Guest)[2010-3-30 13:1]: ok .thanks pev neo (Guest)[2010-3-30 13:2]: yes. sysgen_shell is set to 1 neo (Guest)[2010-3-30 13:2]: Q: yes. sysgen_shell is set to 1 Marcelovk (Guest)[2010-3-30 13:2]: Q: Is there any way to extract the default command lines of the tests in CETK? I want to have it running unconnected from the desktop. Copyright © 2010 – Bruce Eitman All Rights Reserved

Read the article
What are the common issues that can cause slow boot times of Windows CE6 Images?

- by Psychic

I am relatively new to Platform Builder, and whilst I am able to produce nk.bin files, they boot very slowly, 80-100 seconds, so I think there may be some checkbox somewhere that I need to set (or clear)! I've already removed kitl, profiling, etc in the project settings, and set the project to 'release build' & 'ship'. When I looked at the startup event log (in debug), there doesn't appear to be any specific point where it is slow. The log pretty much scrolls all the way through with no major pauses. One thing I found strange was that although the nk.bin file was a lot smaller in release build (just under 12Mb), the boot time didn't noticeably change from the debug build... The board is a Vortex86DX_60A and I'm building CE6. Are there any 'common builder mistakes' that I may be missing here, or is this going to be something a little deeper?

Read the article
python: what are efficient techniques to deal with deeply nested data in a flexible manner?

- by AlexandreS

My question is not about a specific code snippet but more general, so please bear with me: How should I organize the data I'm analyzing, and which tools should I use to manage it? I'm using python and numpy to analyse data. Because the python documentation indicates that dictionaries are very optimized in python, and also due to the fact that the data itself is very structured, I stored it in a deeply nested dictionary. Here is a skeleton of the dictionary: the position in the hierarchy defines the nature of the element, and each new line defines the contents of a key in the precedent level: [AS091209M02] [AS091209M01] [AS090901M06] ... [100113] [100211] [100128] [100121] [R16] [R17] [R03] [R15] [R05] [R04] [R07] ... [1263399103] ... [ImageSize] [FilePath] [Trials] [Depth] [Frames] [Responses] ... [N01] [N04] ... [Sequential] [Randomized] [Ch1] [Ch2] Edit: To explain a bit better my data set: [individual] ex: [AS091209M02] [imaging session (date string)] ex: [100113] [Region imaged] ex: [R16] [timestamp of file] ex [1263399103] [properties of file] ex: [Responses] [regions of interest in image ] ex [N01] [format of data] ex [Sequential] [channel of acquisition: this key indexes an array of values] ex [Ch1] The type of operations I perform is for instance to compute properties of the arrays (listed under Ch1, Ch2), pick up arrays to make a new collection, for instance analyze responses of N01 from region 16 (R16) of a given individual at different time points, etc. This structure works well for me and is very fast, as promised. I can analyze the full data set pretty quickly (and the dictionary is far too small to fill up my computer's ram : half a gig). My problem comes from the cumbersome manner in which I need to program the operations of the dictionary. I often have stretches of code that go like this: for mk in dic.keys(): for rgk in dic[mk].keys(): for nk in dic[mk][rgk].keys(): for ik in dic[mk][rgk][nk].keys(): for ek in dic[mk][rgk][nk][ik].keys(): #do something which is ugly, cumbersome, non reusable, and brittle (need to recode it for any variant of the dictionary). I tried using recursive functions, but apart from the simplest applications, I ran into some very nasty bugs and bizarre behaviors that caused a big waste of time (it does not help that I don't manage to debug with pdb in ipython when I'm dealing with deeply nested recursive functions). In the end the only recursive function I use regularly is the following: def dicExplorer(dic, depth = -1, stp = 0): '''prints the hierarchy of a dictionary. if depth not specified, will explore all the dictionary ''' if depth - stp == 0: return try : list_keys = dic.keys() except AttributeError: return stp += 1 for key in list_keys: else: print '+%s> [\'%s\']' %(stp * '---', key) dicExplorer(dic[key], depth, stp) I know I'm doing this wrong, because my code is long, noodly and non-reusable. I need to either use better techniques to flexibly manipulate the dictionaries, or to put the data in some database format (sqlite?). My problem is that since I'm (badly) self-taught in regards to programming, I lack practical experience and background knowledge to appreciate the options available. I'm ready to learn new tools (SQL, object oriented programming), whatever it takes to get the job done, but I am reluctant to invest my time and efforts into something that will be a dead end for my needs. So what are your suggestions to tackle this issue, and be able to code my tools in a more brief, flexible and re-usable manner?

Read the article
DSP - Problems using the inverse Fast Fourier Transform

- by Trap

I've been playing around a little with the Exocortex implementation of the FFT, but I'm having some problems. First, after calculating the inverse FFT of an unchanged frequency spectrum obtained by a previous forward FFT, one would expect to get the original signal back, but this is not the case. I had to figure out that I needed to scale the FFT output to about 1 / fftLength to get the amplitudes ok. Why is this? Second, whenever I modify the amplitudes of the frequency bins before calling the iFFT the signal gets distorted at low frequencies. However, this does not happen if I attenuate all the bins by the same factor. Let me put a very simplified example of the output buffer of a 4-sample FFT: // Bin 0 (DC) FFTOut[0] = 0.0000610351563 FFTOut[1] = 0.0 // Bin 1 FFTOut[2] = 0.000331878662 FFTOut[3] = 0.000629425049 // Central bin FFTOut[4] = -0.0000381469727 FFTOut[5] = 0.0 // Bin 3, this is a negative frequency bin. FFTOut[6] = 0.000331878662 FFTOut[7] = -0.000629425049 The output is composed of pairs of floats, each representing the real and imaginay parts of a single bin. So, bin 0 (array indexes 0, 1) would represent the real and imaginary parts of the DC frequency. As you can see, bins 1 and 3 both have the same values, (except for the sign of the Im part), so I guess these are the negative frequency values, and finally indexes (4, 5) would be the central frequency bin. To attenuate the frequency bin 1 this is what I do: // Attenuate the 'positive' bin FFTOut[2] *= 0.5; FFTOut[3] *= 0.5; // Attenuate its corresponding negative bin. FFTOut[6] *= 0.5; FFTOut[7] *= 0.5; For the actual tests I'm using a 1024-length FFT and I always provide all the samples so no 0-padding is needed. // Attenuate var halfSize = fftWindowLength / 2; float leftFreq = 0f; float rightFreq = 22050f; for( var c = 1; c < halfSize; c++ ) { var freq = c * (44100d / halfSize); // Calc. positive and negative frequency locations. var k = c * 2; var nk = (fftWindowLength - c) * 2; // This kind of attenuation corresponds to a high-pass filter. // The attenuation at the transition band is linearly applied, could // this be the cause of the distortion of low frequencies? var attn = (freq < leftFreq) ? 0 : (freq < rightFreq) ? ((freq - leftFreq) / (rightFreq - leftFreq)) : 1; mFFTOut[ k ] *= (float)attn; mFFTOut[ k + 1 ] *= (float)attn; mFFTOut[ nk ] *= (float)attn; mFFTOut[ nk + 1 ] *= (float)attn; } Obviously I'm doing something wrong but can't figure out what or where.

Read the article
DSP - Filtering in the frequency domain via FFT

- by Trap

I've been playing around a little with the Exocortex implementation of the FFT, but I'm having some problems. Whenever I modify the amplitudes of the frequency bins before calling the iFFT the resulting signal contains some clicks and pops, especially when low frequencies are present in the signal (like drums or basses). However, this does not happen if I attenuate all the bins by the same factor. Let me put an example of the output buffer of a 4-sample FFT: // Bin 0 (DC) FFTOut[0] = 0.0000610351563 FFTOut[1] = 0.0 // Bin 1 FFTOut[2] = 0.000331878662 FFTOut[3] = 0.000629425049 // Bin 2 FFTOut[4] = -0.0000381469727 FFTOut[5] = 0.0 // Bin 3, this is the first and only negative frequency bin. FFTOut[6] = 0.000331878662 FFTOut[7] = -0.000629425049 The output is composed of pairs of floats, each representing the real and imaginay parts of a single bin. So, bin 0 (array indexes 0, 1) would represent the real and imaginary parts of the DC frequency. As you can see, bins 1 and 3 both have the same values, (except for the sign of the Im part), so I guess bin 3 is the first negative frequency, and finally indexes (4, 5) would be the last positive frequency bin. Then to attenuate the frequency bin 1 this is what I do: // Attenuate the 'positive' bin FFTOut[2] *= 0.5; FFTOut[3] *= 0.5; // Attenuate its corresponding negative bin. FFTOut[6] *= 0.5; FFTOut[7] *= 0.5; For the actual tests I'm using a 1024-length FFT and I always provide all the samples so no 0-padding is needed. // Attenuate var halfSize = fftWindowLength / 2; float leftFreq = 0f; float rightFreq = 22050f; for( var c = 1; c < halfSize; c++ ) { var freq = c * (44100d / halfSize); // Calc. positive and negative frequency indexes. var k = c * 2; var nk = (fftWindowLength - c) * 2; // This kind of attenuation corresponds to a high-pass filter. // The attenuation at the transition band is linearly applied, could // this be the cause of the distortion of low frequencies? var attn = (freq < leftFreq) ? 0 : (freq < rightFreq) ? ((freq - leftFreq) / (rightFreq - leftFreq)) : 1; // Attenuate positive and negative bins. mFFTOut[ k ] *= (float)attn; mFFTOut[ k + 1 ] *= (float)attn; mFFTOut[ nk ] *= (float)attn; mFFTOut[ nk + 1 ] *= (float)attn; } Obviously I'm doing something wrong but can't figure out what. I don't want to use the FFT output as a means to generate a set of FIR coefficients since I'm trying to implement a very basic dynamic equalizer. What's the correct way to filter in the frequency domain? what I'm missing? Thanks in advance.

Read the article
On counting pairs of words that differ by one letter

- by Quintofron

Let us consider n words, each of length k. Those words consist of letters over an alphabet (whose cardinality is n) with defined order. The task is to derive an O(nk) algorithm to count the number of pairs of words that differ by one position (no matter which one exactly, as long as it's only a single position). For instance, in the following set of words (n = 5, k = 4): abcd, abdd, adcb, adcd, aecd there are 5 such pairs: (abcd, abdd), (abcd, adcd), (abcd, aecd), (adcb, adcd), (adcd, aecd). So far I've managed to find an algorithm that solves a slightly easier problem: counting the number of pairs of words that differ by one GIVEN position (i-th). In order to do this I swap the letter at the ith position with the last letter within each word, perform a Radix sort (ignoring the last position in each word - formerly the ith position), linearly detect words whose letters at the first 1 to k-1 positions are the same, eventually count the number of occurrences of each letter at the last (originally ith) position within each set of duplicates and calculate the desired pairs (the last part is simple). However, the algorithm above doesn't seem to be applicable to the main problem (under the O(nk) constraint) - at least not without some modifications. Any idea how to solve this?

Read the article
Multi Pivoting on single Source data

- by Nev_Rahd

I am trying to mutlipivot source data (as below ) want results as single column (as below) My query so far is SELECT * FROM ( SELECT * FROM ( SELECT NK, DC, VERSION, GEV FROM MULTIPIVOT ) SRC PIVOT ( MAX(GEV) FOR DC IN ( [10], [11], [12], [18] ) ) AS PVT ) SRC PIVOT ( MAX([18]) FOR VERSION IN ( [2006], [2007], [2008],[2009] ) )AS PVT which outputs results as what is the way to get this as single row? Thanks

Read the article
How John Got 15x Improvement Without Really Trying

- by rchrd

The following article was published on a Sun Microsystems website a number of years ago by John Feo. It is still useful and worth preserving. So I'm republishing it here. How I Got 15x Improvement Without Really Trying John Feo, Sun Microsystems Taking ten "personal" program codes used in scientific and engineering research, the author was able to get from 2 to 15 times performance improvement easily by applying some simple general optimization techniques. Introduction Scientific research based on computer simulation depends on the simulation for advancement. The research can advance only as fast as the computational codes can execute. The codes' efficiency determines both the rate and quality of results. In the same amount of time, a faster program can generate more results and can carry out a more detailed simulation of physical phenomena than a slower program. Highly optimized programs help science advance quickly and insure that monies supporting scientific research are used as effectively as possible. Scientific computer codes divide into three broad categories: ISV, community, and personal. ISV codes are large, mature production codes developed and sold commercially. The codes improve slowly over time both in methods and capabilities, and they are well tuned for most vendor platforms. Since the codes are mature and complex, there are few opportunities to improve their performance solely through code optimization. Improvements of 10% to 15% are typical. Examples of ISV codes are DYNA3D, Gaussian, and Nastran. Community codes are non-commercial production codes used by a particular research field. Generally, they are developed and distributed by a single academic or research institution with assistance from the community. Most users just run the codes, but some develop new methods and extensions that feed back into the general release. The codes are available on most vendor platforms. Since these codes are younger than ISV codes, there are more opportunities to optimize the source code. Improvements of 50% are not unusual. Examples of community codes are AMBER, CHARM, BLAST, and FASTA. Personal codes are those written by single users or small research groups for their own use. These codes are not distributed, but may be passed from professor-to-student or student-to-student over several years. They form the primordial ocean of applications from which community and ISV codes emerge. Government research grants pay for the development of most personal codes. This paper reports on the nature and performance of this class of codes. Over the last year, I have looked at over two dozen personal codes from more than a dozen research institutions. The codes cover a variety of scientific fields, including astronomy, atmospheric sciences, bioinformatics, biology, chemistry, geology, and physics. The sources range from a few hundred lines to more than ten thousand lines, and are written in Fortran, Fortran 90, C, and C++. For the most part, the codes are modular, documented, and written in a clear, straightforward manner. They do not use complex language features, advanced data structures, programming tricks, or libraries. I had little trouble understanding what the codes did or how data structures were used. Most came with a makefile. Surprisingly, only one of the applications is parallel. All developers have access to parallel machines, so availability is not an issue. Several tried to parallelize their applications, but stopped after encountering difficulties. Lack of education and a perception that parallelism is difficult prevented most from trying. I parallelized several of the codes using OpenMP, and did not judge any of the codes as difficult to parallelize. Even more surprising than the lack of parallelism is the inefficiency of the codes. I was able to get large improvements in performance in a matter of a few days applying simple optimization techniques. Table 1 lists ten representative codes [names and affiliation are omitted to preserve anonymity]. Improvements on one processor range from 2x to 15.5x with a simple average of 4.75x. I did not use sophisticated performance tools or drill deep into the program's execution character as one would do when tuning ISV or community codes. Using only a profiler and source line timers, I identified inefficient sections of code and improved their performance by inspection. The changes were at a high level. I am sure there is another factor of 2 or 3 in each code, and more if the codes are parallelized. The study’s results show that personal scientific codes are running many times slower than they should and that the problem is pervasive. Computational scientists are not sloppy programmers; however, few are trained in the art of computer programming or code optimization. I found that most have a working knowledge of some programming language and standard software engineering practices; but they do not know, or think about, how to make their programs run faster. They simply do not know the standard techniques used to make codes run faster. In fact, they do not even perceive that such techniques exist. The case studies described in this paper show that applying simple, well known techniques can significantly increase the performance of personal codes. It is important that the scientific community and the Government agencies that support scientific research find ways to better educate academic scientific programmers. The inefficiency of their codes is so bad that it is retarding both the quality and progress of scientific research. # cacheperformance redundantoperations loopstructures performanceimprovement 1 x x 15.5 2 x 2.8 3 x x 2.5 4 x 2.1 5 x x 2.0 6 x 5.0 7 x 5.8 8 x 6.3 9 2.2 10 x x 3.3 Table 1 — Area of improvement and performance gains of 10 codes The remainder of the paper is organized as follows: sections 2, 3, and 4 discuss the three most common sources of inefficiencies in the codes studied. These are cache performance, redundant operations, and loop structures. Each section includes several examples. The last section summaries the work and suggests a possible solution to the issues raised. Optimizing cache performance Commodity microprocessor systems use caches to increase memory bandwidth and reduce memory latencies. Typical latencies from processor to L1, L2, local, and remote memory are 3, 10, 50, and 200 cycles, respectively. Moreover, bandwidth falls off dramatically as memory distances increase. Programs that do not use cache effectively run many times slower than programs that do. When optimizing for cache, the biggest performance gains are achieved by accessing data in cache order and reusing data to amortize the overhead of cache misses. Secondary considerations are prefetching, associativity, and replacement; however, the understanding and analysis required to optimize for the latter are probably beyond the capabilities of the non-expert. Much can be gained simply by accessing data in the correct order and maximizing data reuse. 6 out of the 10 codes studied here benefited from such high level optimizations. Array Accesses The most important cache optimization is the most basic: accessing Fortran array elements in column order and C array elements in row order. Four of the ten codes—1, 2, 4, and 10—got it wrong. Compilers will restructure nested loops to optimize cache performance, but may not do so if the loop structure is too complex, or the loop body includes conditionals, complex addressing, or function calls. In code 1, the compiler failed to invert a key loop because of complex addressing do I = 0, 1010, delta_x IM = I - delta_x IP = I + delta_x do J = 5, 995, delta_x JM = J - delta_x JP = J + delta_x T1 = CA1(IP, J) + CA1(I, JP) T2 = CA1(IM, J) + CA1(I, JM) S1 = T1 + T2 - 4 * CA1(I, J) CA(I, J) = CA1(I, J) + D * S1 end do end do In code 2, the culprit is conditionals do I = 1, N do J = 1, N If (IFLAG(I,J) .EQ. 0) then T1 = Value(I, J-1) T2 = Value(I-1, J) T3 = Value(I, J) T4 = Value(I+1, J) T5 = Value(I, J+1) Value(I,J) = 0.25 * (T1 + T2 + T5 + T4) Delta = ABS(T3 - Value(I,J)) If (Delta .GT. MaxDelta) MaxDelta = Delta endif enddo enddo I fixed both programs by inverting the loops by hand. Code 10 has three-dimensional arrays and triply nested loops. The structure of the most computationally intensive loops is too complex to invert automatically or by hand. The only practical solution is to transpose the arrays so that the dimension accessed by the innermost loop is in cache order. The arrays can be transposed at construction or prior to entering a computationally intensive section of code. The former requires all array references to be modified, while the latter is cost effective only if the cost of the transpose is amortized over many accesses. I used the second approach to optimize code 10. Code 5 has four-dimensional arrays and loops are nested four deep. For all of the reasons cited above the compiler is not able to restructure three key loops. Assume C arrays and let the four dimensions of the arrays be i, j, k, and l. In the original code, the index structure of the three loops is L1: for i L2: for i L3: for i for l for l for j for k for j for k for j for k for l So only L3 accesses array elements in cache order. L1 is a very complex loop—much too complex to invert. I brought the loop into cache alignment by transposing the second and fourth dimensions of the arrays. Since the code uses a macro to compute all array indexes, I effected the transpose at construction and changed the macro appropriately. The dimensions of the new arrays are now: i, l, k, and j. L3 is a simple loop and easily inverted. L2 has a loop-carried scalar dependence in k. By promoting the scalar name that carries the dependence to an array, I was able to invert the third and fourth subloops aligning the loop with cache. Code 5 is by far the most difficult of the four codes to optimize for array accesses; but the knowledge required to fix the problems is no more than that required for the other codes. I would judge this code at the limits of, but not beyond, the capabilities of appropriately trained computational scientists. Array Strides When a cache miss occurs, a line (64 bytes) rather than just one word is loaded into the cache. If data is accessed stride 1, than the cost of the miss is amortized over 8 words. Any stride other than one reduces the cost savings. Two of the ten codes studied suffered from non-unit strides. The codes represent two important classes of "strided" codes. Code 1 employs a multi-grid algorithm to reduce time to convergence. The grids are every tenth, fifth, second, and unit element. Since time to convergence is inversely proportional to the distance between elements, coarse grids converge quickly providing good starting values for finer grids. The better starting values further reduce the time to convergence. The downside is that grids of every nth element, n > 1, introduce non-unit strides into the computation. In the original code, much of the savings of the multi-grid algorithm were lost due to this problem. I eliminated the problem by compressing (copying) coarse grids into continuous memory, and rewriting the computation as a function of the compressed grid. On convergence, I copied the final values of the compressed grid back to the original grid. The savings gained from unit stride access of the compressed grid more than paid for the cost of copying. Using compressed grids, the loop from code 1 included in the previous section becomes do j = 1, GZ do i = 1, GZ T1 = CA(i+0, j-1) + CA(i-1, j+0) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) S1 = T1 + T4 - 4 * CA1(i+0, j+0) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 enddo enddo where CA and CA1 are compressed arrays of size GZ. Code 7 traverses a list of objects selecting objects for later processing. The labels of the selected objects are stored in an array. The selection step has unit stride, but the processing steps have irregular stride. A fix is to save the parameters of the selected objects in temporary arrays as they are selected, and pass the temporary arrays to the processing functions. The fix is practical if the same parameters are used in selection as in processing, or if processing comprises a series of distinct steps which use overlapping subsets of the parameters. Both conditions are true for code 7, so I achieved significant improvement by copying parameters to temporary arrays during selection. Data reuse In the previous sections, we optimized for spatial locality. It is also important to optimize for temporal locality. Once read, a datum should be used as much as possible before it is forced from cache. Loop fusion and loop unrolling are two techniques that increase temporal locality. Unfortunately, both techniques increase register pressure—as loop bodies become larger, the number of registers required to hold temporary values grows. Once register spilling occurs, any gains evaporate quickly. For multiprocessors with small register sets or small caches, the sweet spot can be very small. In the ten codes presented here, I found no opportunities for loop fusion and only two opportunities for loop unrolling (codes 1 and 3). In code 1, unrolling the outer and inner loop one iteration increases the number of result values computed by the loop body from 1 to 4, do J = 1, GZ-2, 2 do I = 1, GZ-2, 2 T1 = CA1(i+0, j-1) + CA1(i-1, j+0) T2 = CA1(i+1, j-1) + CA1(i+0, j+0) T3 = CA1(i+0, j+0) + CA1(i-1, j+1) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) T5 = CA1(i+2, j+0) + CA1(i+1, j+1) T6 = CA1(i+1, j+1) + CA1(i+0, j+2) T7 = CA1(i+2, j+1) + CA1(i+1, j+2) S1 = T1 + T4 - 4 * CA1(i+0, j+0) S2 = T2 + T5 - 4 * CA1(i+1, j+0) S3 = T3 + T6 - 4 * CA1(i+0, j+1) S4 = T4 + T7 - 4 * CA1(i+1, j+1) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 CA(i+1, j+0) = CA1(i+1, j+0) + DD * S2 CA(i+0, j+1) = CA1(i+0, j+1) + DD * S3 CA(i+1, j+1) = CA1(i+1, j+1) + DD * S4 enddo enddo The loop body executes 12 reads, whereas as the rolled loop shown in the previous section executes 20 reads to compute the same four values. In code 3, two loops are unrolled 8 times and one loop is unrolled 4 times. Here is the before for (k = 0; k < NK[u]; k++) { sum = 0.0; for (y = 0; y < NY; y++) { sum += W[y][u][k] * delta[y]; } backprop[i++]=sum; } and after code for (k = 0; k < KK - 8; k+=8) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (y = 0; y < NY; y++) { sum0 += W[y][0][k+0] * delta[y]; sum1 += W[y][0][k+1] * delta[y]; sum2 += W[y][0][k+2] * delta[y]; sum3 += W[y][0][k+3] * delta[y]; sum4 += W[y][0][k+4] * delta[y]; sum5 += W[y][0][k+5] * delta[y]; sum6 += W[y][0][k+6] * delta[y]; sum7 += W[y][0][k+7] * delta[y]; } backprop[k+0] = sum0; backprop[k+1] = sum1; backprop[k+2] = sum2; backprop[k+3] = sum3; backprop[k+4] = sum4; backprop[k+5] = sum5; backprop[k+6] = sum6; backprop[k+7] = sum7; } for one of the loops unrolled 8 times. Optimizing for temporal locality is the most difficult optimization considered in this paper. The concepts are not difficult, but the sweet spot is small. Identifying where the program can benefit from loop unrolling or loop fusion is not trivial. Moreover, it takes some effort to get it right. Still, educating scientific programmers about temporal locality and teaching them how to optimize for it will pay dividends. Reducing instruction count Execution time is a function of instruction count. Reduce the count and you usually reduce the time. The best solution is to use a more efficient algorithm; that is, an algorithm whose order of complexity is smaller, that converges quicker, or is more accurate. Optimizing source code without changing the algorithm yields smaller, but still significant, gains. This paper considers only the latter because the intent is to study how much better codes can run if written by programmers schooled in basic code optimization techniques. The ten codes studied benefited from three types of "instruction reducing" optimizations. The two most prevalent were hoisting invariant memory and data operations out of inner loops. The third was eliminating unnecessary data copying. The nature of these inefficiencies is language dependent. Memory operations The semantics of C make it difficult for the compiler to determine all the invariant memory operations in a loop. The problem is particularly acute for loops in functions since the compiler may not know the values of the function's parameters at every call site when compiling the function. Most compilers support pragmas to help resolve ambiguities; however, these pragmas are not comprehensive and there is no standard syntax. To guarantee that invariant memory operations are not executed repetitively, the user has little choice but to hoist the operations by hand. The problem is not as severe in Fortran programs because in the absence of equivalence statements, it is a violation of the language's semantics for two names to share memory. Codes 3 and 5 are C programs. In both cases, the compiler did not hoist all invariant memory operations from inner loops. Consider the following loop from code 3 for (y = 0; y < NY; y++) { i = 0; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += delta[y] * I1[i++]; } } } Since dW[y][u] can point to the same memory space as delta for one or more values of y and u, assignment to dW[y][u][k] may change the value of delta[y]. In reality, dW and delta do not overlap in memory, so I rewrote the loop as for (y = 0; y < NY; y++) { i = 0; Dy = delta[y]; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += Dy * I1[i++]; } } } Failure to hoist invariant memory operations may be due to complex address calculations. If the compiler can not determine that the address calculation is invariant, then it can hoist neither the calculation nor the associated memory operations. As noted above, code 5 uses a macro to address four-dimensional arrays #define MAT4D(a,q,i,j,k) (double *)((a)->data + (q)*(a)->strides[0] + (i)*(a)->strides[3] + (j)*(a)->strides[2] + (k)*(a)->strides[1]) The macro is too complex for the compiler to understand and so, it does not identify any subexpressions as loop invariant. The simplest way to eliminate the address calculation from the innermost loop (over i) is to define a0 = MAT4D(a,q,0,j,k) before the loop and then replace all instances of *MAT4D(a,q,i,j,k) in the loop with a0[i] A similar problem appears in code 6, a Fortran program. The key loop in this program is do n1 = 1, nh nx1 = (n1 - 1) / nz + 1 nz1 = n1 - nz * (nx1 - 1) do n2 = 1, nh nx2 = (n2 - 1) / nz + 1 nz2 = n2 - nz * (nx2 - 1) ndx = nx2 - nx1 ndy = nz2 - nz1 gxx = grn(1,ndx,ndy) gyy = grn(2,ndx,ndy) gxy = grn(3,ndx,ndy) balance(n1,1) = balance(n1,1) + (force(n2,1) * gxx + force(n2,2) * gxy) * h1 balance(n1,2) = balance(n1,2) + (force(n2,1) * gxy + force(n2,2) * gyy)*h1 end do end do The programmer has written this loop well—there are no loop invariant operations with respect to n1 and n2. However, the loop resides within an iterative loop over time and the index calculations are independent with respect to time. Trading space for time, I precomputed the index values prior to the entering the time loop and stored the values in two arrays. I then replaced the index calculations with reads of the arrays. Data operations Ways to reduce data operations can appear in many forms. Implementing a more efficient algorithm produces the biggest gains. The closest I came to an algorithm change was in code 4. This code computes the inner product of K-vectors A(i) and B(j), 0 = i < N, 0 = j < M, for most values of i and j. Since the program computes most of the NM possible inner products, it is more efficient to compute all the inner products in one triply-nested loop rather than one at a time when needed. The savings accrue from reading A(i) once for all B(j) vectors and from loop unrolling. for (i = 0; i < N; i+=8) { for (j = 0; j < M; j++) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (k = 0; k < K; k++) { sum0 += A[i+0][k] * B[j][k]; sum1 += A[i+1][k] * B[j][k]; sum2 += A[i+2][k] * B[j][k]; sum3 += A[i+3][k] * B[j][k]; sum4 += A[i+4][k] * B[j][k]; sum5 += A[i+5][k] * B[j][k]; sum6 += A[i+6][k] * B[j][k]; sum7 += A[i+7][k] * B[j][k]; } C[i+0][j] = sum0; C[i+1][j] = sum1; C[i+2][j] = sum2; C[i+3][j] = sum3; C[i+4][j] = sum4; C[i+5][j] = sum5; C[i+6][j] = sum6; C[i+7][j] = sum7; }} This change requires knowledge of a typical run; i.e., that most inner products are computed. The reasons for the change, however, derive from basic optimization concepts. It is the type of change easily made at development time by a knowledgeable programmer. In code 5, we have the data version of the index optimization in code 6. Here a very expensive computation is a function of the loop indices and so cannot be hoisted out of the loop; however, the computation is invariant with respect to an outer iterative loop over time. We can compute its value for each iteration of the computation loop prior to entering the time loop and save the values in an array. The increase in memory required to store the values is small in comparison to the large savings in time. The main loop in Code 8 is doubly nested. The inner loop includes a series of guarded computations; some are a function of the inner loop index but not the outer loop index while others are a function of the outer loop index but not the inner loop index for (j = 0; j < N; j++) { for (i = 0; i < M; i++) { r = i * hrmax; R = A[j]; temp = (PRM[3] == 0.0) ? 1.0 : pow(r, PRM[3]); high = temp * kcoeff * B[j] * PRM[2] * PRM[4]; low = high * PRM[6] * PRM[6] / (1.0 + pow(PRM[4] * PRM[6], 2.0)); kap = (R > PRM[6]) ? high * R * R / (1.0 + pow(PRM[4]*r, 2.0) : low * pow(R/PRM[6], PRM[5]); < rest of loop omitted > }} Note that the value of temp is invariant to j. Thus, we can hoist the computation for temp out of the loop and save its values in an array. for (i = 0; i < M; i++) { r = i * hrmax; TEMP[i] = pow(r, PRM[3]); } [N.B. – the case for PRM[3] = 0 is omitted and will be reintroduced later.] We now hoist out of the inner loop the computations invariant to i. Since the conditional guarding the value of kap is invariant to i, it behooves us to hoist the computation out of the inner loop, thereby executing the guard once rather than M times. The final version of the code is for (j = 0; j < N; j++) { R = rig[j] / 1000.; tmp1 = kcoeff * par[2] * beta[j] * par[4]; tmp2 = 1.0 + (par[4] * par[4] * par[6] * par[6]); tmp3 = 1.0 + (par[4] * par[4] * R * R); tmp4 = par[6] * par[6] / tmp2; tmp5 = R * R / tmp3; tmp6 = pow(R / par[6], par[5]); if ((par[3] == 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp5; } else if ((par[3] == 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp4 * tmp6; } else if ((par[3] != 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp5; } else if ((par[3] != 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp4 * tmp6; } for (i = 0; i < M; i++) { kap = KAP[i]; r = i * hrmax; < rest of loop omitted > } } Maybe not the prettiest piece of code, but certainly much more efficient than the original loop, Copy operations Several programs unnecessarily copy data from one data structure to another. This problem occurs in both Fortran and C programs, although it manifests itself differently in the two languages. Code 1 declares two arrays—one for old values and one for new values. At the end of each iteration, the array of new values is copied to the array of old values to reset the data structures for the next iteration. This problem occurs in Fortran programs not included in this study and in both Fortran 77 and Fortran 90 code. Introducing pointers to the arrays and swapping pointer values is an obvious way to eliminate the copying; but pointers is not a feature that many Fortran programmers know well or are comfortable using. An easy solution not involving pointers is to extend the dimension of the value array by 1 and use the last dimension to differentiate between arrays at different times. For example, if the data space is N x N, declare the array (N, N, 2). Then store the problem’s initial values in (_, _, 2) and define the scalar names new = 2 and old = 1. At the start of each iteration, swap old and new to reset the arrays. The old–new copy problem did not appear in any C program. In programs that had new and old values, the code swapped pointers to reset data structures. Where unnecessary coping did occur is in structure assignment and parameter passing. Structures in C are handled much like scalars. Assignment causes the data space of the right-hand name to be copied to the data space of the left-hand name. Similarly, when a structure is passed to a function, the data space of the actual parameter is copied to the data space of the formal parameter. If the structure is large and the assignment or function call is in an inner loop, then copying costs can grow quite large. While none of the ten programs considered here manifested this problem, it did occur in programs not included in the study. A simple fix is always to refer to structures via pointers. Optimizing loop structures Since scientific programs spend almost all their time in loops, efficient loops are the key to good performance. Conditionals, function calls, little instruction level parallelism, and large numbers of temporary values make it difficult for the compiler to generate tightly packed, highly efficient code. Conditionals and function calls introduce jumps that disrupt code flow. Users should eliminate or isolate conditionls to their own loops as much as possible. Often logical expressions can be substituted for if-then-else statements. For example, code 2 includes the following snippet MaxDelta = 0.0 do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) if (Delta > MaxDelta) MaxDelta = Delta enddo enddo if (MaxDelta .gt. 0.001) goto 200 Since the only use of MaxDelta is to control the jump to 200 and all that matters is whether or not it is greater than 0.001, I made MaxDelta a boolean and rewrote the snippet as MaxDelta = .false. do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) MaxDelta = MaxDelta .or. (Delta .gt. 0.001) enddo enddo if (MaxDelta) goto 200 thereby, eliminating the conditional expression from the inner loop. A microprocessor can execute many instructions per instruction cycle. Typically, it can execute one or more memory, floating point, integer, and jump operations. To be executed simultaneously, the operations must be independent. Thick loops tend to have more instruction level parallelism than thin loops. Moreover, they reduce memory traffice by maximizing data reuse. Loop unrolling and loop fusion are two techniques to increase the size of loop bodies. Several of the codes studied benefitted from loop unrolling, but none benefitted from loop fusion. This observation is not too surpising since it is the general tendency of programmers to write thick loops. As loops become thicker, the number of temporary values grows, increasing register pressure. If registers spill, then memory traffic increases and code flow is disrupted. A thick loop with many temporary values may execute slower than an equivalent series of thin loops. The biggest gain will be achieved if the thick loop can be split into a series of independent loops eliminating the need to write and read temporary arrays. I found such an occasion in code 10 where I split the loop do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do into two disjoint loops do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) end do end do do i = 1, n do j = 1, m C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do Conclusions Over the course of the last year, I have had the opportunity to work with over two dozen academic scientific programmers at leading research universities. Their research interests span a broad range of scientific fields. Except for two programs that relied almost exclusively on library routines (matrix multiply and fast Fourier transform), I was able to improve significantly the single processor performance of all codes. Improvements range from 2x to 15.5x with a simple average of 4.75x. Changes to the source code were at a very high level. I did not use sophisticated techniques or programming tools to discover inefficiencies or effect the changes. Only one code was parallel despite the availability of parallel systems to all developers. Clearly, we have a problem—personal scientific research codes are highly inefficient and not running parallel. The developers are unaware of simple optimization techniques to make programs run faster. They lack education in the art of code optimization and parallel programming. I do not believe we can fix the problem by publishing additional books or training manuals. To date, the developers in questions have not studied the books or manual available, and are unlikely to do so in the future. Short courses are a possible solution, but I believe they are too concentrated to be much use. The general concepts can be taught in a three or four day course, but that is not enough time for students to practice what they learn and acquire the experience to apply and extend the concepts to their codes. Practice is the key to becoming proficient at optimization. I recommend that graduate students be required to take a semester length course in optimization and parallel programming. We would never give someone access to state-of-the-art scientific equipment costing hundreds of thousands of dollars without first requiring them to demonstrate that they know how to use the equipment. Yet the criterion for time on state-of-the-art supercomputers is at most an interesting project. Requestors are never asked to demonstrate that they know how to use the system, or can use the system effectively. A semester course would teach them the required skills. Government agencies that fund academic scientific research pay for most of the computer systems supporting scientific research as well as the development of most personal scientific codes. These agencies should require graduate schools to offer a course in optimization and parallel programming as a requirement for funding. About the Author John Feo received his Ph.D. in Computer Science from The University of Texas at Austin in 1986. After graduate school, Dr. Feo worked at Lawrence Livermore National Laboratory where he was the Group Leader of the Computer Research Group and principal investigator of the Sisal Language Project. In 1997, Dr. Feo joined Tera Computer Company where he was project manager for the MTA, and oversaw the programming and evaluation of the MTA at the San Diego Supercomputer Center. In 2000, Dr. Feo joined Sun Microsystems as an HPC application specialist. He works with university research groups to optimize and parallelize scientific codes. Dr. Feo has published over two dozen research articles in the areas of parallel parallel programming, parallel programming languages, and application performance.

Read the article
Introducing the New Boot Framework in CE 7

- by Kate Moss' Open Space

CE 7 introduces a new boot loader framework, BLDR (platform\common\src\common\bldr\). Some people like its powerful and flexbility, others may feel its too complicate as a boot loader framework. Despite to the favor, it is already there; so let's take a look at its features. Unlike the previous BL framwork (CE7 still provides it in platform\common\src\common\boot\) is a monolithic library, the new framework has more architecture structure. It not only defines main body but also provides rich components, such as filesystem (BinFS/FAT), download transportations, display, logging and block devices: bios INT13, FAL, IDE, Flash ( and etc. Note that in the block device category, the FAL is for legacy FMD/FAL, Flash is for latest MSFlash. Some of you may have encountered MSFlash MDD/PDD compatible partition is hard to created in bootloader and now it provides a clean solution! (Since this is a big topic, I will introduce it in future post) Today, I am going to show you some basic helper components - Image Loading functions. When OS image stored in the block device, it can be a file format, says your NK.BIN in the FAT volume or a RAW format, says the image is programmed to a BINFS partition. For the first one you can use BootFileSystemReadBinFile (platform\common\src\common\bldr\fileSystem\utils\fileSystemReadBinFile.c) and use BootBlockLoadBinFsImage (platform\common\src\common\bldr\block\utils\loadBinFs.c) to load from a partition. Need a sample code? No problem, the BootLoaderLoadOs in platform\cepc\src\boot\bldr\loados.c just provide a perfect example.

Read the article
Test of procedure is fine but when called from a menu gives uninitialized errors. C

- by Delfic

The language is portuguese, but I think you get the picture. My main calls only the menu function (the function in comment is the test which works). In the menu i introduce the option 1 which calls the same function. But there's something wrong. If i test it solely on the input: (1/1)x^2 //it reads the polinomyal (2/1) //reads the rational and returns 4 (you can guess what it does, calculates the value of an instace of x over a rational) My polinomyals are linear linked lists with a coeficient (rational) and a degree (int) int main () { menu_interactivo (); // instanciacao (); return 0; } void menu_interactivo(void) { int i; do{ printf("1. Instanciacao de um polinomio com um escalar\n"); printf("2. Multiplicacao de um polinomio por um escalar\n"); printf("3. Soma de dois polinomios\n"); printf("4. Multiplicacao de dois polinomios\n"); printf("5. Divisao de dois polinomios\n"); printf("0. Sair\n"); scanf ("%d", &i); switch (i) { case 0: exit(0); break; case 1: instanciacao (); break; case 2: multiplicacao_esc (); break; case 3: somar_pol (); break; case 4: multiplicacao_pol (); break; case 5: divisao_pol (); break; default:printf("O numero introduzido nao e valido!\n"); } } while (i != 0); } When i call it with the menu, with the same input, it does not stop reading the polinomyal (I know this because it does not ask me for the rational as on the other example) I've run it with valgrind --track-origins=yes returning the following: ==17482== Memcheck, a memory error detector ==17482== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al. ==17482== Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info ==17482== Command: ./teste ==17482== 1. Instanciacao de um polinomio com um escalar 2. Multiplicacao de um polinomio por um escalar 3. Soma de dois polinomios 4. Multiplicacao de dois polinomios 5. Divisao de dois polinomios 0. Sair 1 Introduza um polinomio na forma (n0/d0)x^e0 + (n1/d1)x^e1 + ... + (nk/dk)^ek, com ei > e(i+1): (1/1)x^2 ==17482== Conditional jump or move depends on uninitialised value(s) ==17482== at 0x401126: simplifica_f (fraccoes.c:53) ==17482== by 0x4010CB: le_f (fraccoes.c:30) ==17482== by 0x400CDA: le_pol (polinomios.c:156) ==17482== by 0x400817: instanciacao (t4.c:14) ==17482== by 0x40098C: menu_interactivo (t4.c:68) ==17482== by 0x4009BF: main (t4.c:86) ==17482== Uninitialised value was created by a stack allocation ==17482== at 0x401048: le_f (fraccoes.c:19) ==17482== ==17482== Conditional jump or move depends on uninitialised value(s) ==17482== at 0x400D03: le_pol (polinomios.c:163) ==17482== by 0x400817: instanciacao (t4.c:14) ==17482== by 0x40098C: menu_interactivo (t4.c:68) ==17482== by 0x4009BF: main (t4.c:86) ==17482== Uninitialised value was created by a stack allocation ==17482== at 0x401048: le_f (fraccoes.c:19) ==17482== I will now give you the functions which are called void le_pol (pol *p) { fraccao f; int e; char c; printf ("Introduza um polinomio na forma (n0/d0)x^e0 + (n1/d1)x^e1 + ... + (nk/dk)^ek,\n"); printf("com ei > e(i+1):\n"); *p = NULL; do { le_f (&f); getchar(); getchar(); scanf ("%d", &e); if (f.n != 0) *p = add (*p, f, e); c = getchar (); if (c != '\n') { getchar(); getchar(); } } while (c != '\n'); } void instanciacao (void) { pol p1; fraccao f; le_pol (&p1); printf ("Insira uma fraccao na forma (n/d):\n"); le_f (&f); escreve_f(inst_esc_pol(p1, f)); } void le_f (fraccao *f) { int n, d; getchar (); scanf ("%d", &n); getchar (); scanf ("%d", &d); getchar (); assert (d != 0); *f = simplifica_f(cria_f(n, d)); } simplifica_f simplifies a rational and cria_f creates a rationa given the numerator and the denominator Can someone help me please? Thanks in advance. If you want me to provide some tests, just post it. See ya.

Read the article
Statically Compiling Qt 4.6.2

- by geeko

This what I did but it results in errors: 1: In win32-msvc2008\qmake.conf I set QMAKE_CFLAGS_RELEASE = -O1 -Og -GL -MD 2: From MSVC2008 CMD I run vcvarsall.bat x86 and vcvars32.bat "C:\Program Files\Microsoft Visual Studio 9.0\VC\bin 3: From Qt 4.6.2 CMD I run the following C:\Qt\4.6.2configure -release -nomake examples -nomake demos -no-exceptions -n o-stl -no-rtti -no-qt3support -no-scripttools -no-openssl -no-opengl -no-webkit -no-phonon -no-style-motif -no-style-cde -no-style-cleanlooks -no-style-plastique -no-sql-sqlite -platform win32-msvc2008 -static -qt-libjpeg -qt-zlib -qt-libpng and then nmake However, I ended up every time with these errors: link /LIBPATH:"c:\Qt\4.6.2\lib" /LIBPATH:"c:\Qt\4.6.2\lib" /NOLOGO /INCR EMENTAL:NO /MANIFEST /MANIFESTFILE:"tmp\obj\release_static\assistant_adp.interme diate.manifest" /SUBSYSTEM:WINDOWS "/MANIFESTDEPENDENCY:type='win32' name='Micro soft.Windows.Common-Controls' version='6.0.0.0' publicKeyToken='6595b64144ccf1df ' language='' processorArchitecture=''" /OUT:......\bin\assistant_adp.exe @C :\DOCUME~1\Geeko\LOCALS~1\Temp\nm3F8.tmp fontpanel.obj : MSIL .netmodule or module compiled with /GL found; restarting li nk with /LTCG; add /LTCG to the link command line to improve linker performance main.obj : error LNK2001: unresolved external symbol "class QObject * __cdecl qt _plugin_instance_qjpeg(void)" (?qt_plugin_instance_qjpeg@@YAPAVQObject@@XZ) ......\bin\assistant_adp.exe : fatal error LNK1120: 1 unresolved externals NMAKE : fatal error U1077: '"C:\Program Files\Microsoft Visual Studio 9.0\VC\BIN \link.EXE"' : return code '0x460' Stop. NMAKE : fatal error U1077: '"C:\Program Files\Microsoft Visual Studio 9.0\VC\BIN \nmake.exe"' : return code '0x2' Stop. NMAKE : fatal error U1077: 'cd' : return code '0x2' Stop. NMAKE : fatal error U1077: 'cd' : return code '0x2' Stop. NMAKE : fatal error U1077: 'cd' : return code '0x2' Stop. Thank you in deed.

Read the article
Merging k sorted linked lists - analysis

- by Kotti

Hi! I am thinking about different solutions for one problem. Assume we have K sorted linked lists and we are merging them into one. All these lists together have N elements. The well known solution is to use priority queue and pop / push first elements from every lists and I can understand why it takes O(N log K) time. But let's take a look at another approach. Suppose we have some MERGE_LISTS(LIST1, LIST2) procedure, that merges two sorted lists and it would take O(T1 + T2) time, where T1 and T2 stand for LIST1 and LIST2 sizes. What we do now generally means pairing these lists and merging them pair-by-pair (if the number is odd, last list, for example, could be ignored at first steps). This generally means we have to make the following "tree" of merge operations: N1, N2, N3... stand for LIST1, LIST2, LIST3 sizes O(N1 + N2) + O(N3 + N4) + O(N5 + N6) + ... O(N1 + N2 + N3 + N4) + O(N5 + N6 + N7 + N8) + ... O(N1 + N2 + N3 + N4 + .... + NK) It looks obvious that there will be log(K) of these rows, each of them implementing O(N) operations, so time for MERGE(LIST1, LIST2, ... , LISTK) operation would actually equal O(N log K). My friend told me (two days ago) it would take O(K N) time. So, the question is - did I f%ck up somewhere or is he actually wrong about this? And if I am right, why doesn't this 'divide&conquer' approach can't be used instead of priority queue approach?

Read the article
Override an IOCTL Handler in PQOAL

- by Kate Moss' Big Fan

When porting or creating a BSP to a new platform, we often need to make change to OEMIoControl or HAL IOCTL handler for more specific. Since Microsoft introduced PQOAL in CE 5.0 and more and more BSP today leverages PQOAL to simplify the OAL, we no longer define the OEMIoControl directly. It is somehow analogous to migrate from pure Windows SDK to MFC; people starts to define those MFC handlers and forgot the WinMain and the big message loop. If you ever take a look at the interface between OAL and Kernel, PUBLIC\COMMON\OAK\INC\oemglobal.h, the pfnOEMIoctl is still there just as the entry point of Windows Program is WinMain since day one. (For those may argue about pfnOEMIoctl is not OEMIoControl, I will encourage you to dig into PRIVATE\WINCEOS\COREOS\NK\OEMMAIN\oemglobal.c which initialized pfnOEMIoctl to OEMIoControl. The interface is just to split OAL and Kernel which no longer linked to one executable file in CE 6, all of the function signature is still identical) So let's trace into PQOAL to realize how it implements OEMIoControl and how can we override an IOCTL handler we interest. First thing to know is the entry point (just as finding the WinMain in MFC), OEMIoControl is defined in PLATFORM\COMMON\SRC\COMMON\IOCTL\ioctl.c. Basically, it does nothing special but scan a pre-defined IOCTL table, g_oalIoCtlTable, and then execute the handler. (The highlight part) Other than that is just for error handling and the use of critical section to serialize the function. BOOL OEMIoControl( DWORD code, VOID *pInBuffer, DWORD inSize, VOID *pOutBuffer, DWORD outSize, DWORD *pOutSize ) { BOOL rc = FALSE; UINT32 i; ... // Search the IOCTL table for the requested code. for (i = 0; g_oalIoCtlTable[i].pfnHandler != NULL; i++) { if (g_oalIoCtlTable[i].code == code) break; } // Indicate unsupported code if (g_oalIoCtlTable[i].pfnHandler == NULL) { NKSetLastError(ERROR_NOT_SUPPORTED); OALMSG(OAL_IOCTL, ( L"OEMIoControl: Unsupported Code 0x%x - device 0x%04x func %d\r\n", code, code >> 16, (code >> 2)&0x0FFF )); goto cleanUp; } // Take critical section if required (after postinit & no flag) if ( g_ioctlState.postInit && (g_oalIoCtlTable[i].flags & OAL_IOCTL_FLAG_NOCS) == 0 ) { // Take critical section EnterCriticalSection(&g_ioctlState.cs); } // Execute the handler rc = g_oalIoCtlTable[i].pfnHandler( code, pInBuffer, inSize, pOutBuffer, outSize, pOutSize ); // Release critical section if it was taken above if ( g_ioctlState.postInit && (g_oalIoCtlTable[i].flags & OAL_IOCTL_FLAG_NOCS) == 0 ) { // Release critical section LeaveCriticalSection(&g_ioctlState.cs); } cleanUp: OALMSG(OAL_IOCTL&&OAL_FUNC, (L"-OEMIoControl(rc = %d)\r\n", rc )); return rc; } Where is the g_oalIoCtlTable? It is defined in your BSP. Let's use DeviceEmulator BSP as an example. The PLATFORM\DEVICEEMULATOR\SRC\OAL\OALLIB\ioctl.c defines the table as const OAL_IOCTL_HANDLER g_oalIoCtlTable[] = { #include "ioctl_tab.h" }; And that leads to PLATFORM\DEVICEEMULATOR\SRC\INC\ioctl_tab.h which defined some of IOCTL handler but others are defined in oal_ioctl_tab.h which is under PLATFORM\COMMON\SRC\INC\. Finally, we got the full table body! (Just like tracing MFC, always jumping back and forth). The format of table is very straight forward, IOCTL code, Flags and Handler Function // IOCTL CODE, Flags Handler Function //------------------------------------------------------------------------------ { IOCTL_HAL_INITREGISTRY, 0, OALIoCtlHalInitRegistry }, { IOCTL_HAL_INIT_RTC, 0, OALIoCtlHalInitRTC }, { IOCTL_HAL_REBOOT, 0, OALIoCtlHalReboot }, The PQOAL scans through the table until it find a matched IOCTL code, then invokes the handler function. Since it scans the table from the top which means if we define TWO handler with same IOCTL code, the first one is always invoked with no exception. Now back to the PLATFORM\DEVICEEMULATOR\SRC\INC\ioctl_tab.h, with the following table { IOCTL_HAL_INITREGISTRY, 0, OALIoCtlDeviceEmulatorHalInitRegistry }, ... #include <oal_ioctl_tab.h> Note the IOCTL_HAL_INITREGISTRY handler are defined in both BSP's local ioctl_tab.h and the common oal_ioctl_tab.h, but due to BSP's local handler comes before "#include <oal_ioctl_tab.h>" so we know the OALIoCtlDeviceEmulatorHalInitRegistry always get called. In this example, the DeviceEmulator BSP overrides the IOCTL_HAL_INITREGISTRY handler from OALIoCtlHalInitRegistry to OALIoCtlDeviceEmulatorHalInitRegistry by manipulating the g_oalIoCtlTable table. (In some point of view, it is similar to message map in MFC) Please be aware, when you override an IOCTL handler in PQOAL, you may want to clone the original implementation to your BSP and change to meet your need. It is recommended and save you the redundant works but remember to rename the handler function (Just like the DeviceEmulator it changes the name of OALIoCtlHalInitRegistry to OALIoCtlDeviceEmulatorHalInitRegistry). If you don't change the name, linker may not be happy (due to name conflict) and the more important is by using different handler name, you could always redirect the handler back to original one. (It is like the concept of OOP that calling a function in base class; still not so clear? I am goinf to show you soon!) The OALIoCtlDeviceEmulatorHalInitRegistry setups DeviceEmulator specific registry settings and in the end, if everything goes well, it calls the OALIoCtlHalInitRegistry (PLATFORM\COMMON\SRC\COMMON\IOCTL\reginit.c) to do the rest. if(fOk) { fOk = OALIoCtlHalInitRegistry(code, pInpBuffer, inpSize, pOutBuffer, outSize, pOutSize); } Now you got the picture, whenever you want to override an IOCTL hadnler that is implemented in PQOAL just Clone the handler function to your BSP as a template. Simple name change for the handler function, and a name change in the IOCTL table header file that maps the IOCTL with the function Implement your IOCTL handler and whenever you need to redirect it back just calling the original handler function. It is the standard way of implementing a custom IOCTL and most Microsoft developers prefer. The mapping of IOCTL routine to IOCTL code is platform specific - you control the header file that does that mapping.

Read the article
Executing Stored Procedure for each InputRow + SSIS Script Component.

- by Nev_Rahd

Hello, In my Script Component, am trying to execute Stored Procedure = which return multiple rows = of which need to generate output rows. Code as below: /* Microsoft SQL Server Integration Services Script Component * Write scripts using Microsoft Visual C# 2008. * ScriptMain is the entry point class of the script.*/ using System; using System.Data; using System.Data.SqlClient; using Microsoft.SqlServer.Dts.Pipeline.Wrapper; using Microsoft.SqlServer.Dts.Runtime.Wrapper; [Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute] public class ScriptMain : UserComponent { SqlConnection cnn = new SqlConnection(); IDTSConnectionManager100 cnManager; //string cmd; SqlCommand cmd = new SqlCommand(); public override void AcquireConnections(object Transaction) { cnManager = base.Connections.myConnection; cnn = (SqlConnection)cnManager.AcquireConnection(null); } public override void PreExecute() { base.PreExecute(); } public override void PostExecute() { base.PostExecute(); } public override void InputRows_ProcessInputRow(InputRowsBuffer Row) { while(Row.NextRow()) { DataTable dt = new DataTable(); cmd.Connection = cnn; cmd.CommandText = "OSPATTRIBUTE_GetOPNforOP"; cmd.CommandType = CommandType.StoredProcedure; cmd.Parameters.Add("@NK", SqlDbType.VarChar).Value = Row.OPNK.ToString(); cmd.Parameters.Add("@EDWSTARTDATE", SqlDbType.DateTime).Value = Row.EDWEFFECTIVESTARTDATETIME; SqlDataAdapter adapter = new SqlDataAdapter(cmd); adapter.Fill(dt); foreach (DataRow dtrow in dt.Rows) { OutputValidBuffer.AddRow(); OutputValidBuffer.OPNK = Row.OPNK; OutputValidBuffer.OSPTYPECODE = Row.OSPTYPECODE; OutputValidBuffer.ORGPROVTYPEDESC = Row.ORGPROVTYPEDESC; OutputValidBuffer.HEALTHSECTORCODE = Row.HEALTHSECTORCODE; OutputValidBuffer.HEALTHSECTORDESCRIPTION = Row.HEALTHSECTORDESCRIPTION; OutputValidBuffer.EDWEFFECTIVESTARTDATETIME = Row.EDWEFFECTIVESTARTDATETIME; OutputValidBuffer.EDWEFFECTIVEENDDATETIME = Row.EDWEFFECTIVEENDDATETIME; OutputValidBuffer.OPQI = Row.OPQI; OutputValidBuffer.OPNNK = dtrow[0].ToString(); OutputValidBuffer.OSPNAMETYPECODE = dtrow[1].ToString(); OutputValidBuffer.NAMETYPEDESC = dtrow[2].ToString(); OutputValidBuffer.OSPNAME = dtrow[3].ToString(); OutputValidBuffer.EDWEFFECTIVESTARTDATETIME1 = Row.EDWEFFECTIVESTARTDATETIME; OutputValidBuffer.EDWEFFECTIVEENDDATETIME1 = Row.EDWEFFECTIVEENDDATETIME; OutputValidBuffer.OPNQI = dtrow[6].ToString(); } } } public override void ReleaseConnections() { cnManager.ReleaseConnection(cnn); } } This is always skipping the first row. while(Row.NextRow()) is always bringing the second row of the input buffer. What am I doing wrong. Thanks

Read the article

Search Results

Search found 28 results on 2 pages for 'y nk'.

Page 1/2 | 1 2 | Next Page >

- by Kate Moss' Open Space

- by Kate Moss' Big Fan

- by developer

- by Nigel Rivett

- by NK

- by Nev_Rahd

- by Kate Moss' Open Space

- by mr.bio

- by N.K.

- by mr.bio

- by Nev_Rahd

- by Bruce Eitman

- by Psychic

- by AlexandreS

- by Trap

- by Trap

- by Quintofron

- by Nev_Rahd

- by rchrd

- by Kate Moss' Open Space

- by Delfic

- by geeko

- by Kotti

- by Kate Moss' Big Fan

- by Nev_Rahd

1 2 | Next Page >