Search Results

Search found 68828 results on 2754 pages for 'knapsack problem'.

Page 760/2754 | < Previous Page | 756 757 758 759 760 761 762 763 764 765 766 767  | Next Page >

  • Implementing an async "read all currently available data from stream" operation

    - by Jon
    I recently provided an answer to this question: C# - Realtime console output redirection. As often happens, explaining stuff (here "stuff" was how I tackled a similar problem) leads you to greater understanding and/or, as is the case here, "oops" moments. I realized that my solution, as implemented, has a bug. The bug has little practical importance, but it has an extremely large importance to me as a developer: I can't rest easy knowing that my code has the potential to blow up. Squashing the bug is the purpose of this question. I apologize for the long intro, so let's get dirty. I wanted to build a class that allows me to receive input from a console's standard output Stream. Console output streams are of type FileStream; the implementation can cast to that, if needed. There is also an associated StreamReader already present to leverage. There is only one thing I need to implement in this class to achieve my desired functionality: an async "read all the data available this moment" operation. Reading to the end of the stream is not viable because the stream will not end unless the process closes the console output handle, and it will not do that because it is interactive and expecting input before continuing. I will be using that hypothetical async operation to implement event-based notification, which will be more convenient for my callers. The public interface of the class is this: public class ConsoleAutomator { public event EventHandler<ConsoleOutputReadEventArgs> StandardOutputRead; public void StartSendingEvents(); public void StopSendingEvents(); } StartSendingEvents and StopSendingEvents do what they advertise; for the purposes of this discussion, we can assume that events are always being sent without loss of generality. The class uses these two fields internally: protected readonly StringBuilder inputAccumulator = new StringBuilder(); protected readonly byte[] buffer = new byte[256]; The functionality of the class is implemented in the methods below. To get the ball rolling: public void StartSendingEvents(); { this.stopAutomation = false; this.BeginReadAsync(); } To read data out of the Stream without blocking, and also without requiring a carriage return char, BeginRead is called: protected void BeginReadAsync() { if (!this.stopAutomation) { this.StandardOutput.BaseStream.BeginRead( this.buffer, 0, this.buffer.Length, this.ReadHappened, null); } } The challenging part: BeginRead requires using a buffer. This means that when reading from the stream, it is possible that the bytes available to read ("incoming chunk") are larger than the buffer. Remember that the goal here is to read all of the chunk and call event subscribers exactly once for each chunk. To this end, if the buffer is full after EndRead, we don't send its contents to subscribers immediately but instead append them to a StringBuilder. The contents of the StringBuilder are only sent back whenever there is no more to read from the stream. private void ReadHappened(IAsyncResult asyncResult) { var bytesRead = this.StandardOutput.BaseStream.EndRead(asyncResult); if (bytesRead == 0) { this.OnAutomationStopped(); return; } var input = this.StandardOutput.CurrentEncoding.GetString( this.buffer, 0, bytesRead); this.inputAccumulator.Append(input); if (bytesRead < this.buffer.Length) { this.OnInputRead(); // only send back if we 're sure we got it all } this.BeginReadAsync(); // continue "looping" with BeginRead } After any read which is not enough to fill the buffer (in which case we know that there was no more data to be read during the last read operation), all accumulated data is sent to the subscribers: private void OnInputRead() { var handler = this.StandardOutputRead; if (handler == null) { return; } handler(this, new ConsoleOutputReadEventArgs(this.inputAccumulator.ToString())); this.inputAccumulator.Clear(); } (I know that as long as there are no subscribers the data gets accumulated forever. This is a deliberate decision). The good This scheme works almost perfectly: Async functionality without spawning any threads Very convenient to the calling code (just subscribe to an event) Never more than one event for each time data is available to be read Is almost agnostic to the buffer size The bad That last almost is a very big one. Consider what happens when there is an incoming chunk with length exactly equal to the size of the buffer. The chunk will be read and buffered, but the event will not be triggered. This will be followed up by a BeginRead that expects to find more data belonging to the current chunk in order to send it back all in one piece, but... there will be no more data in the stream. In fact, as long as data is put into the stream in chunks with length exactly equal to the buffer size, the data will be buffered and the event will never be triggered. This scenario may be highly unlikely to occur in practice, especially since we can pick any number for the buffer size, but the problem is there. Solution? Unfortunately, after checking the available methods on FileStream and StreamReader, I can't find anything which lets me peek into the stream while also allowing async methods to be used on it. One "solution" would be to have a thread wait on a ManualResetEvent after the "buffer filled" condition is detected. If the event is not signaled (by the async callback) in a small amount of time, then more data from the stream will not be forthcoming and the data accumulated so far should be sent to subscribers. However, this introduces the need for another thread, requires thread synchronization, and is plain inelegant. Specifying a timeout for BeginRead would also suffice (call back into my code every now and then so I can check if there's data to be sent back; most of the time there will not be anything to do, so I expect the performance hit to be negligible). But it looks like timeouts are not supported in FileStream. Since I imagine that async calls with timeouts are an option in bare Win32, another approach might be to PInvoke the hell out of the problem. But this is also undesirable as it will introduce complexity and simply be a pain to code. Is there an elegant way to get around the problem? Thanks for being patient enough to read all of this. Update: I definitely did not communicate the scenario well in my initial writeup. I have since revised the writeup quite a bit, but to be extra sure: The question is about how to implement an async "read all the data available this moment" operation. My apologies to the people who took the time to read and answer without me making my intent clear enough.

    Read the article

  • "Exception has been thrown by the target of an invocation" Running Tests - VS2008 SP1

    - by omatrot
    I'm using Visual Studio 2008 Team Suite and I'm unable to run tests and display the Test/Windows/Test Result Window. The result is a dialog box with the following content : "Exception has been thrown by the target of an invocation". Team Explorer has been installed after Visual Studio 2008 SP1. So I have re-apllied the service pack. Searching the web I found that this error is pretty common but unfortunately, the proposed solutions does not work for me. The problem was never analysed so I decided to give it a try : I reproduced the problem on a computer, attached the process with windbg and start with the basic investigations. Following are the first results : 0:000>!dumpstack OS Thread Id: 0xdb0 (0) Current frame: USER32!NtUserWaitMessage+0x15 ChildEBP RetAddr Caller,Callee 003fec94 75a32674 USER32!DialogBox2+0x222, calling USER32!NtUserWaitMessage 003fecd0 75a3288a USER32!InternalDialogBox+0xe5, calling USER32!DialogBox2 003fecfc 75a6f8d0 USER32!SoftModalMessageBox+0x757, calling USER32!InternalDialogBox 003fed3c 6eb61996 mscorwks!Thread::ReverseLeaveRuntime+0x95, calling mscorwks!_EH_epilog3 003fedb0 75a6fbac USER32!MessageBoxWorker+0x269, calling USER32!SoftModalMessageBox 003fede0 6ea559c3 mscorwks!SetupThreadNoThrow+0x19a, calling mscorwks!_EH_epilog3_catch_GS 003fee24 6eb61d8a mscorwks!HasIllegalReentrancy+0xac, calling mscorwks!_EH_epilog3 003fee30 6ea89796 mscorwks!SimpleComCallWrapper::Release+0x2e, calling mscorwks!CompareExchangeMP 003fee38 6ea0da05 mscorwks!CLRException::HandlerState::CleanupTry+0x16, calling mscorwks!GetCurrentSEHRecord 003fee44 6ea0c9c0 mscorwks!Thread::EnablePreemptiveGC+0xf, calling mscorwks!Thread::CatchAtSafePoint 003fee4c 6ea8a241 mscorwks!Unknown_Release_Internal+0x24d, calling mscorwks!GCHolder<1,0,0>::Pop 003fee50 6ea0c86c mscorwks!_EH_epilog3_catch_GS+0xa, calling mscorwks!__security_check_cookie 003fee54 6ea8a24c mscorwks!Unknown_Release_Internal+0x258, calling mscorwks!_EH_epilog3_catch_GS 003fee7c 75a16941 USER32!UserCallWinProcCheckWow+0x13d, calling ntdll!RtlDeactivateActivationContextUnsafeFast 003feed8 7082119e msenv!ATL::CComCritSecLock<ATL::CComCriticalSection>::Lock+0xd, calling ntdll!RtlEnterCriticalSection 003fef08 75a6fe5b USER32!MessageBoxIndirectW+0x2e, calling USER32!MessageBoxWorker 003fef7c 70a1e367 msenv!MessageBoxPVoidW+0xda 003fefd4 70a1db60 msenv!VBDialogCover2+0x11b 003ff01c 70a1e4c0 msenv!VBMessageBox2W+0xf0, calling msenv!VBDialogCover2 003ff044 7087246b msenv!main_GetAppNameW+0xa, calling msenv!GetAppNameInternal 003ff04c 70a1e4f2 msenv!VBMessageBox3W+0x1c, calling msenv!VBMessageBox2W 003ff064 70a1d6d7 msenv!_IdMsgShow+0x362, calling msenv!VBMessageBox3W 003ff0cc 70951841 msenv!TaskDialogCallback+0x7e0, calling msenv!_IdMsgShow 003ff118 6eb20da4 mscorwks!Unknown_QueryInterface+0x230, calling mscorwks!_EH_epilog3_catch_GS 003ff14c 6eb20c43 mscorwks!Unknown_QueryInterface_Internal+0x3d8, calling mscorwks!_EH_epilog3_catch_GS 003ff168 02006ec4 02006ec4, calling 0247a1e8 003ff16c 6ea0c86c mscorwks!_EH_epilog3_catch_GS+0xa, calling mscorwks!__security_check_cookie 003ff198 6eb20562 mscorwks!COMToCLRWorker+0xb34, calling mscorwks!_EH_epilog3_catch_GS 003ff19c 0247a235 0247a235, calling mscorwks!COMToCLRWorker 003ff1c4 7083249f msenv!CVSCommandTarget::ExecCmd+0x937 003ff1e4 7086d5c8 msenv!VsReportErrorInfo+0x11, calling msenv!TaskDialogCallback+0xd8 003ff1f8 7093e65b msenv!CVSCommandTarget::ExecCmd+0x945, calling msenv!VsReportErrorInfo 003ff25c 7081f53a msenv!ATL::CComPtr<IVsLanguageInfo>::~CComPtr<IVsLanguageInfo>+0x24, calling msenv!_EH_epilog3 003ff260 70b18d72 msenv!LogCommand+0x4c, calling msenv!ATL::CComPtr<IVsCodePageSelection>::~CComPtr<IVsCodePageSelection> 003ff264 70b18d77 msenv!LogCommand+0x51, calling msenv!_EH_epilog3 003ff280 70a4fd0e msenv!CMsoButtonUser::FClick+0x1d1, calling msenv!CVSCommandTarget::ExecCmd 003ff2f4 70823a87 msenv!CTLSITE::QueryInterface+0x16 003ff31c 70cb7d4d msenv!TBCB::FNotifyFocus+0x204 003ff35c 70ce5fda msenv!TB::NotifyControl+0x101 003ff3bc 709910f6 msenv!TB::FRequestFocus+0x4ed, calling msenv!TB::NotifyControl 003ff414 708254ba msenv!CMsoButtonUser::FEnabled+0x3d, calling msenv!GetQueryStatusFlags 003ff428 7086222a msenv!TBC::FAutoEnabled+0x24 003ff43c 7098e1eb msenv!TB::LProcessInputMsg+0xdb4 003ff458 6bec1c49 (MethodDesc 0x6bcd7f54 +0x89 System.Windows.Forms.Form.DefWndProc(System.Windows.Forms.Message ByRef)), calling 6be3b738 003ff50c 70823ab0 msenv!FPtbFromSite+0x16 003ff520 70991c43 msenv!TB::PtbParent+0x25, calling msenv!FPtbFromSite 003ff52c 708dda49 msenv!TBWndProc+0x2da 003ff588 0203d770 0203d770, calling 0247a1e8 003ff598 70822a70 msenv!CPaneFrame::Release+0x118, calling msenv!_EH_epilog3 003ff5b0 75a16238 USER32!InternalCallWinProc+0x23 003ff5dc 75a168ea USER32!UserCallWinProcCheckWow+0x109, calling USER32!InternalCallWinProc 003ff620 75a16899 USER32!UserCallWinProcCheckWow+0x6a, calling ntdll!RtlActivateActivationContextUnsafeFast 003ff654 75a17d31 USER32!DispatchMessageWorker+0x3bc, calling USER32!UserCallWinProcCheckWow 003ff688 70847f2b msenv!CMsoComponent::FPreTranslateMessage+0x72, calling msenv!MainFTranslateMessage 003ff6b4 75a17dfa USER32!DispatchMessageW+0xf, calling USER32!DispatchMessageWorker 003ff6c4 70831553 msenv!EnvironmentMsgLoop+0x1ea, calling USER32!DispatchMessageW 003ff6f8 708eb9bd msenv!CMsoCMHandler::FPushMessageLoop+0x86, calling msenv!EnvironmentMsgLoop 003ff724 708eb94d msenv!SCM::FPushMessageLoop+0xb7 003ff74c 708eb8e9 msenv!SCM_MsoCompMgr::FPushMessageLoop+0x28, calling msenv!SCM::FPushMessageLoop 003ff768 708eb8b8 msenv!CMsoComponent::PushMsgLoop+0x28 003ff788 708ebe4e msenv!VStudioMainLogged+0x482, calling msenv!CMsoComponent::PushMsgLoop 003ff7ac 70882afe msenv!CVsActivityLogSingleton::Instance+0xdf, calling msenv!_EH_epilog3 003ff7d8 70882afe msenv!CVsActivityLogSingleton::Instance+0xdf, calling msenv!_EH_epilog3 003ff7dc 707e4e31 msenv!VActivityLogStartupEntries+0x42 003ff7f4 7081f63b msenv!ATL::CComPtr<IClassFactory>::~CComPtr<IClassFactory>+0x24, calling msenv!_EH_epilog3 003ff7f8 708b250f msenv!ATL::CComQIPtr<IUnknown,&IID_IUnknown>::~CComQIPtr<IUnknown,&IID_IUnknown>+0x1d, calling msenv!_EH_epilog3 003ff820 708e7561 msenv!VStudioMain+0xc1, calling msenv!VStudioMainLogged 003ff84c 2f32aabc devenv!util_CallVsMain+0xff 003ff878 2f3278f2 devenv!CDevEnvAppId::Run+0x11fd, calling devenv!util_CallVsMain 003ff97c 77533b23 ntdll!RtlpAllocateHeap+0xe73, calling ntdll!_SEH_epilog4 003ff9f0 77536cd7 ntdll!RtlpLowFragHeapAllocFromContext+0x882, calling ntdll!RtlpSubSegmentInitialize 003ffa10 7753609f ntdll!RtlNtStatusToDosError+0x3b, calling ntdll!RtlNtStatusToDosErrorNoTeb 003ffa14 775360a4 ntdll!RtlNtStatusToDosError+0x40, calling ntdll!_SEH_epilog4 003ffa40 775360a4 ntdll!RtlNtStatusToDosError+0x40, calling ntdll!_SEH_epilog4 003ffa44 75bd2736 kernel32!LocalBaseRegOpenKey+0x159, calling ntdll!RtlNtStatusToDosError 003ffa48 75bd2762 kernel32!LocalBaseRegOpenKey+0x22a, calling kernel32!_SEH_epilog4 003ffac4 75bd2762 kernel32!LocalBaseRegOpenKey+0x22a, calling kernel32!_SEH_epilog4 003ffac8 75bd28c9 kernel32!RegOpenKeyExInternalW+0x130, calling kernel32!LocalBaseRegOpenKey 003ffad8 75bd28de kernel32!RegOpenKeyExInternalW+0x211 003ffae0 75bd28e5 kernel32!RegOpenKeyExInternalW+0x21d, calling kernel32!_SEH_epilog4 003ffb04 6f282e2b MSVCR90!_unlock+0x15, calling ntdll!RtlLeaveCriticalSection 003ffb14 75bd2642 kernel32!BaseRegCloseKeyInternal+0x41, calling ntdll!NtClose 003ffb28 75bd25d0 kernel32!RegCloseKey+0xd4, calling kernel32!_SEH_epilog4 003ffb5c 75bd25d0 kernel32!RegCloseKey+0xd4, calling kernel32!_SEH_epilog4 003ffb60 2f321ea4 devenv!DwInitSyncObjects+0x340 003ffb90 2f327bf4 devenv!WinMain+0x74, calling devenv!CDevEnvAppId::Run 003ffbac 2f327c68 devenv!License::GetPID+0x258, calling devenv!WinMain 003ffc3c 75bd3677 kernel32!BaseThreadInitThunk+0xe 003ffc48 77539d72 ntdll!__RtlUserThreadStart+0x70 003ffc88 77539d45 ntdll!_RtlUserThreadStart+0x1b, calling ntdll!__RtlUserThreadStart 0:000> !pe -nested Exception object: 050aae9c Exception type: System.Reflection.TargetInvocationException Message: Exception has been thrown by the target of an invocation. InnerException: System.NullReferenceException, use !PrintException 050aac64 to see more StackTrace (generated): SP IP Function 003FEC2C 6D2700F7 mscorlib_ni!System.RuntimeType.CreateInstanceSlow(Boolean, Boolean)+0x57 003FEC5C 6D270067 mscorlib_ni!System.RuntimeType.CreateInstanceImpl(Boolean, Boolean, Boolean)+0xe7 003FEC94 6D270264 mscorlib_ni!System.Activator.CreateInstance(System.Type, Boolean)+0x44 003FECA4 6AD02DAF Microsoft_VisualStudio_Shell_9_0_ni!Microsoft.VisualStudio.Shell.Package.CreateToolWindow(System.Type, Int32, Microsoft.VisualStudio.Shell.ProvideToolWindowAttribute)+0x67 003FED30 6AD0311B Microsoft_VisualStudio_Shell_9_0_ni!Microsoft.VisualStudio.Shell.Package.CreateToolWindow(System.Type, Int32)+0xb7 003FED58 6AD02D12 Microsoft_VisualStudio_Shell_9_0_ni!Microsoft.VisualStudio.Shell.Package.FindToolWindow(System.Type, Int32, Boolean, Microsoft.VisualStudio.Shell.ProvideToolWindowAttribute)+0x7a 003FED88 6AD02D39 Microsoft_VisualStudio_Shell_9_0_ni!Microsoft.VisualStudio.Shell.Package.FindToolWindow(System.Type, Int32, Boolean)+0x11 003FED94 02585E30 Microsoft_VisualStudio_QualityTools_TestCaseManagement!Microsoft.VisualStudio.TestTools.TestCaseManagement.QualityToolsPackage.InitToolWindowVariable[[System.__Canon, mscorlib]](System.__Canon ByRef, System.String, Boolean)+0x58 003FEDD0 02585DBE Microsoft_VisualStudio_QualityTools_TestCaseManagement!Microsoft.VisualStudio.TestTools.TestCaseManagement.QualityToolsPackage.InitToolWindowVariable[[System.__Canon, mscorlib]](System.__Canon ByRef, System.String)+0x36 003FEDE4 02585D32 Microsoft_VisualStudio_QualityTools_TestCaseManagement!Microsoft.VisualStudio.TestTools.TestCaseManagement.QualityToolsPackage.ShowToolWindow[[System.__Canon, mscorlib]](System.__Canon ByRef, System.String, Boolean)+0x3a 003FEE00 02585AB4 Microsoft_VisualStudio_QualityTools_TestCaseManagement!Microsoft.VisualStudio.TestTools.TestCaseManagement.QualityToolsPackage.OpenTestResultsToolWindow()+0x2c 003FEE10 02585A6E Microsoft_VisualStudio_QualityTools_TestCaseManagement!Microsoft.VisualStudio.TestTools.TestCaseManagement.QualityToolsPackage.OnMenuViewTestResults(System.Object, System.EventArgs)+0x6 003FEE18 6CD4F993 System_ni!System.ComponentModel.Design.MenuCommand.Invoke()+0x43 003FEE40 6CD4F9D4 System_ni!System.ComponentModel.Design.MenuCommand.Invoke(System.Object)+0x8 003FEE48 6AD000FA Microsoft_VisualStudio_Shell_9_0_ni!Microsoft.VisualStudio.Shell.OleMenuCommandService.Microsoft.VisualStudio.OLE.Interop.IOleCommandTarget.Exec(System.Guid ByRef, UInt32, UInt32, IntPtr, IntPtr)+0x11a 003FEEA0 6AD03FB8 Microsoft_VisualStudio_Shell_9_0_ni!Microsoft.VisualStudio.Shell.Package.Microsoft.VisualStudio.OLE.Interop.IOleCommandTarget.Exec(System.Guid ByRef, UInt32, UInt32, IntPtr, IntPtr)+0x44 StackTraceString: <none> HResult: 80131604 0:000> !PrintException 050aac64 Exception object: 050aac64 Exception type: System.NullReferenceException Message: Object reference not set to an instance of an object. InnerException: <none> StackTrace (generated): SP IP Function 003FE660 078E60BE Microsoft_VisualStudio_TeamSystem_Integration!Microsoft.VisualStudio.TeamSystem.Integration.TcmResultsPublishManager..ctor(Microsoft.VisualStudio.TeamSystem.Integration.ResultsPublishManager)+0xc6 003FE674 078E5C91 Microsoft_VisualStudio_TeamSystem_Integration!Microsoft.VisualStudio.TeamSystem.Integration.ResultsPublishManager..ctor(Microsoft.VisualStudio.TeamSystem.Integration.TeamFoundationHostHelper)+0x59 003FE684 078E2FA0 Microsoft_VisualStudio_TeamSystem_Integration!Microsoft.VisualStudio.TeamSystem.Integration.VsetServerHelper..ctor(System.IServiceProvider)+0x50 003FE6A4 078E2E90 Microsoft_VisualStudio_TeamSystem_Common!Microsoft.VisualStudio.TeamSystem.Integration.Client.VsetHelper.InitializeThrow(System.IServiceProvider)+0x20 003FE6B8 078E2E2A Microsoft_VisualStudio_TeamSystem_Common!Microsoft.VisualStudio.TeamSystem.Integration.Client.VsetHelper.InitializeHelper(System.IServiceProvider)+0x22 003FE6E0 078E2DEC Microsoft_VisualStudio_TeamSystem_Common!Microsoft.VisualStudio.TeamSystem.Integration.Client.VsetHelper.CreateVsetHelper(System.IServiceProvider)+0x1c 003FE6F0 078E2DAC Microsoft_VisualStudio_QualityTools_TestCaseManagement!Microsoft.VisualStudio.TestTools.TestCaseManagement.QualityToolsPackage.get_VsetHelper()+0x14 003FE6F8 02586BBE Microsoft_VisualStudio_QualityTools_TestCaseManagement!Microsoft.VisualStudio.TestTools.TestCaseManagement.ResultsToolWindow..ctor()+0x9f6 003FE798 02585F8A Microsoft_VisualStudio_QualityTools_TestCaseManagement!Microsoft.VisualStudio.TestTools.TestCaseManagement.ResultToolWindowHost..ctor()+0x1a StackTraceString: <none> HResult: 80004003 In order to be able to continue the analysis, we need to get the parameters to see what is going on. I also tried to run devenv.exe with the /log switch. No error in the log after reproducing the problem. Finally, If Team Explorer is removed from the system, the problem goes away. Any help appreciated. TIA. Olivier.

    Read the article

  • Solving embarassingly parallel problems using Python multiprocessing

    - by gotgenes
    How does one use multiprocessing to tackle embarrassingly parallel problems? Embarassingly parallel problems typically consist of three basic parts: Read input data (from a file, database, tcp connection, etc.). Run calculations on the input data, where each calculation is independent of any other calculation. Write results of calculations (to a file, database, tcp connection, etc.). We can parallelize the program in two dimensions: Part 2 can run on multiple cores, since each calculation is independent; order of processing doesn't matter. Each part can run independently. Part 1 can place data on an input queue, part 2 can pull data off the input queue and put results onto an output queue, and part 3 can pull results off the output queue and write them out. This seems a most basic pattern in concurrent programming, but I am still lost in trying to solve it, so let's write a canonical example to illustrate how this is done using multiprocessing. Here is the example problem: Given a CSV file with rows of integers as input, compute their sums. Separate the problem into three parts, which can all run in parallel: Process the input file into raw data (lists/iterables of integers) Calculate the sums of the data, in parallel Output the sums Below is traditional, single-process bound Python program which solves these three tasks: #!/usr/bin/env python # -*- coding: UTF-8 -*- # basicsums.py """A program that reads integer values from a CSV file and writes out their sums to another CSV file. """ import csv import optparse import sys def make_cli_parser(): """Make the command line interface parser.""" usage = "\n\n".join(["python %prog INPUT_CSV OUTPUT_CSV", __doc__, """ ARGUMENTS: INPUT_CSV: an input CSV file with rows of numbers OUTPUT_CSV: an output file that will contain the sums\ """]) cli_parser = optparse.OptionParser(usage) return cli_parser def parse_input_csv(csvfile): """Parses the input CSV and yields tuples with the index of the row as the first element, and the integers of the row as the second element. The index is zero-index based. :Parameters: - `csvfile`: a `csv.reader` instance """ for i, row in enumerate(csvfile): row = [int(entry) for entry in row] yield i, row def sum_rows(rows): """Yields a tuple with the index of each input list of integers as the first element, and the sum of the list of integers as the second element. The index is zero-index based. :Parameters: - `rows`: an iterable of tuples, with the index of the original row as the first element, and a list of integers as the second element """ for i, row in rows: yield i, sum(row) def write_results(csvfile, results): """Writes a series of results to an outfile, where the first column is the index of the original row of data, and the second column is the result of the calculation. The index is zero-index based. :Parameters: - `csvfile`: a `csv.writer` instance to which to write results - `results`: an iterable of tuples, with the index (zero-based) of the original row as the first element, and the calculated result from that row as the second element """ for result_row in results: csvfile.writerow(result_row) def main(argv): cli_parser = make_cli_parser() opts, args = cli_parser.parse_args(argv) if len(args) != 2: cli_parser.error("Please provide an input file and output file.") infile = open(args[0]) in_csvfile = csv.reader(infile) outfile = open(args[1], 'w') out_csvfile = csv.writer(outfile) # gets an iterable of rows that's not yet evaluated input_rows = parse_input_csv(in_csvfile) # sends the rows iterable to sum_rows() for results iterable, but # still not evaluated result_rows = sum_rows(input_rows) # finally evaluation takes place as a chain in write_results() write_results(out_csvfile, result_rows) infile.close() outfile.close() if __name__ == '__main__': main(sys.argv[1:]) Let's take this program and rewrite it to use multiprocessing to parallelize the three parts outlined above. Below is a skeleton of this new, parallelized program, that needs to be fleshed out to address the parts in the comments: #!/usr/bin/env python # -*- coding: UTF-8 -*- # multiproc_sums.py """A program that reads integer values from a CSV file and writes out their sums to another CSV file, using multiple processes if desired. """ import csv import multiprocessing import optparse import sys NUM_PROCS = multiprocessing.cpu_count() def make_cli_parser(): """Make the command line interface parser.""" usage = "\n\n".join(["python %prog INPUT_CSV OUTPUT_CSV", __doc__, """ ARGUMENTS: INPUT_CSV: an input CSV file with rows of numbers OUTPUT_CSV: an output file that will contain the sums\ """]) cli_parser = optparse.OptionParser(usage) cli_parser.add_option('-n', '--numprocs', type='int', default=NUM_PROCS, help="Number of processes to launch [DEFAULT: %default]") return cli_parser def main(argv): cli_parser = make_cli_parser() opts, args = cli_parser.parse_args(argv) if len(args) != 2: cli_parser.error("Please provide an input file and output file.") infile = open(args[0]) in_csvfile = csv.reader(infile) outfile = open(args[1], 'w') out_csvfile = csv.writer(outfile) # Parse the input file and add the parsed data to a queue for # processing, possibly chunking to decrease communication between # processes. # Process the parsed data as soon as any (chunks) appear on the # queue, using as many processes as allotted by the user # (opts.numprocs); place results on a queue for output. # # Terminate processes when the parser stops putting data in the # input queue. # Write the results to disk as soon as they appear on the output # queue. # Ensure all child processes have terminated. # Clean up files. infile.close() outfile.close() if __name__ == '__main__': main(sys.argv[1:]) These pieces of code, as well as another piece of code that can generate example CSV files for testing purposes, can be found on github. I would appreciate any insight here as to how you concurrency gurus would approach this problem. Here are some questions I had when thinking about this problem. Bonus points for addressing any/all: Should I have child processes for reading in the data and placing it into the queue, or can the main process do this without blocking until all input is read? Likewise, should I have a child process for writing the results out from the processed queue, or can the main process do this without having to wait for all the results? Should I use a processes pool for the sum operations? If yes, what method do I call on the pool to get it to start processing the results coming into the input queue, without blocking the input and output processes, too? apply_async()? map_async()? imap()? imap_unordered()? Suppose we didn't need to siphon off the input and output queues as data entered them, but could wait until all input was parsed and all results were calculated (e.g., because we know all the input and output will fit in system memory). Should we change the algorithm in any way (e.g., not run any processes concurrently with I/O)?

    Read the article

  • Implementing a robust async stream reader

    - by Jon
    I recently provided an answer to this question: C# - Realtime console output redirection. As often happens, explaining stuff (here "stuff" was how I tackled a similar problem) leads you to greater understanding and/or, as is the case here, "oops" moments. I realized that my solution, as implemented, has a bug. The bug has little practical importance, but it has an extremely large importance to me as a developer: I can't rest easy knowing that my code has the potential to blow up. Squashing the bug is the purpose of this question. I apologize for the long intro, so let's get dirty. I wanted to build a class that allows me to receive input from a Stream in an event-based manner. The stream, in my scenario, is guaranteed to be a FileStream and there is also an associated StreamReader already present to leverage. The public interface of the class is this: public class MyStreamManager { public event EventHandler<ConsoleOutputReadEventArgs> StandardOutputRead; public void StartSendingEvents(); public void StopSendingEvents(); } Obviously this specific scenario has to do with a console's standard output, but that is a detail and does not play an important role. StartSendingEvents and StopSendingEvents do what they advertise; for the purposes of this discussion, we can assume that events are always being sent without loss of generality. The class uses these two fields internally: protected readonly StringBuilder inputAccumulator = new StringBuilder(); protected readonly byte[] buffer = new byte[256]; The functionality of the class is implemented in the methods below. To get the ball rolling: public void StartSendingEvents(); { this.stopAutomation = false; this.BeginReadAsync(); } To read data out of the Stream without blocking, and also without requiring a carriage return char, BeginRead is called: protected void BeginReadAsync() { if (!this.stopAutomation) { this.StandardOutput.BaseStream.BeginRead( this.buffer, 0, this.buffer.Length, this.ReadHappened, null); } } The challenging part: BeginRead requires using a buffer. This means that when reading from the stream, it is possible that the bytes available to read ("incoming chunk") are larger than the buffer. Since we are only handing off data from the stream to a consumer, and that consumer may well have inside knowledge about the size and/or format of these chunks, I want to call event subscribers exactly once for each chunk. Otherwise the abstraction breaks down and the subscribers have to buffer the incoming data and reconstruct the chunks themselves using said knowledge. This is much less convenient to the calling code, and detracts from the usefulness of my class. To this end, if the buffer is full after EndRead, we don't send its contents to subscribers immediately but instead append them to a StringBuilder. The contents of the StringBuilder are only sent back whenever there is no more to read from the stream (thus preserving the chunks). private void ReadHappened(IAsyncResult asyncResult) { var bytesRead = this.StandardOutput.BaseStream.EndRead(asyncResult); if (bytesRead == 0) { this.OnAutomationStopped(); return; } var input = this.StandardOutput.CurrentEncoding.GetString( this.buffer, 0, bytesRead); this.inputAccumulator.Append(input); if (bytesRead < this.buffer.Length) { this.OnInputRead(); // only send back if we 're sure we got it all } this.BeginReadAsync(); // continue "looping" with BeginRead } After any read which is not enough to fill the buffer, all accumulated data is sent to the subscribers: private void OnInputRead() { var handler = this.StandardOutputRead; if (handler == null) { return; } handler(this, new ConsoleOutputReadEventArgs(this.inputAccumulator.ToString())); this.inputAccumulator.Clear(); } (I know that as long as there are no subscribers the data gets accumulated forever. This is a deliberate decision). The good This scheme works almost perfectly: Async functionality without spawning any threads Very convenient to the calling code (just subscribe to an event) Maintains the "chunkiness" of the data; this allows the calling code to use inside knowledge of the data without doing any extra work Is almost agnostic to the buffer size (it will work correctly with any size buffer irrespective of the data being read) The bad That last almost is a very big one. Consider what happens when there is an incoming chunk with length exactly equal to the size of the buffer. The chunk will be read and buffered, but the event will not be triggered. This will be followed up by a BeginRead that expects to find more data belonging to the current chunk in order to send it back all in one piece, but... there will be no more data in the stream. In fact, as long as data is put into the stream in chunks with length exactly equal to the buffer size, the data will be buffered and the event will never be triggered. This scenario may be highly unlikely to occur in practice, especially since we can pick any number for the buffer size, but the problem is there. Solution? Unfortunately, after checking the available methods on FileStream and StreamReader, I can't find anything which lets me peek into the stream while also allowing async methods to be used on it. One "solution" would be to have a thread wait on a ManualResetEvent after the "buffer filled" condition is detected. If the event is not signaled (by the async callback) in a small amount of time, then more data from the stream will not be forthcoming and the data accumulated so far should be sent to subscribers. However, this introduces the need for another thread, requires thread synchronization, and is plain inelegant. Specifying a timeout for BeginRead would also suffice (call back into my code every now and then so I can check if there's data to be sent back; most of the time there will not be anything to do, so I expect the performance hit to be negligible). But it looks like timeouts are not supported in FileStream. Since I imagine that async calls with timeouts are an option in bare Win32, another approach might be to PInvoke the hell out of the problem. But this is also undesirable as it will introduce complexity and simply be a pain to code. Is there an elegant way to get around the problem? Thanks for being patient enough to read all of this.

    Read the article

  • Can this Query be corrected or different table structure needed? (database dumps provided)

    - by sandeepan
    This is a bit lengthy but I have provided sufficient details and kept things very clear. Please see if you can help. (I will surely accept answer if it solves my problem) I am sure a person experienced with this can surely help or suggest me to decide the tables structure. About the system:- There are tutors who create classes A tags based search approach is being followed Tag relations are created/edited when new tutors registers/edits profile data and when tutors create classes (this makes tutors and classes searcheable).For simplicity, let us consider only tutor name and class name are the fields which are matched against search keywords. In this example, I am considering - tutor "Sandeepan Nath" has created a class called "first class" tutor "Bob Cratchit" has created a class called "new class" Desired search results- AND logic to be appied on the search keywords and match against class and tutor data(class name + tutor name), in other words, All those classes be shown such that all the search terms are present in the class name or its tutor name. Example to be clear - Searching "first class" returns class with id_wc = 1. Working Searching "Sandeepan class" should also return class with id_wc = 1. Not working in System 2. Problem with profile editing and searching To tell in one sentence, I am facing a conflict between the ease of profile edition (edition of tag relations when tutor profiles are edited) and the ease of search logic. In the beginning, we had one table structure and search was easy but tag edition logic was very clumsy and unmaintainable(Check System 1 in the section below) . So we created separate tag relations tables to make profile edition simpler but search has become difficult. Please dump the tables so that you can run the search query I have given below and see the results. System 1 (previous system - search easy - profile edition difficult):- Only one table called All_Tag_Relations table had the all the tag relations. The tags table below is common to both systems 1 and 2. CREATE TABLE IF NOT EXISTS `all_tag_relations` ( `id_tag_rel` int(10) NOT NULL AUTO_INCREMENT, `id_tag` int(10) unsigned NOT NULL DEFAULT '0', `id_tutor` int(10) DEFAULT NULL, `id_wc` int(10) unsigned DEFAULT NULL, PRIMARY KEY (`id_tag_rel`), KEY `All_Tag_Relations_FKIndex1` (`id_tag`), KEY `id_wc` (`id_wc`), KEY `id_tag` (`id_tag`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1; INSERT INTO `all_tag_relations` (`id_tag_rel`, `id_tag`, `id_tutor`, `id_wc`) VALUES (1, 1, 1, NULL), (2, 2, 1, NULL), (3, 1, 1, 1), (4, 2, 1, 1), (5, 3, 1, 1), (6, 4, 1, 1), (7, 6, 2, NULL), (8, 7, 2, NULL), (9, 6, 2, 2), (10, 7, 2, 2), (11, 5, 2, 2), (12, 4, 2, 2); CREATE TABLE IF NOT EXISTS `tags` ( `id_tag` int(10) unsigned NOT NULL AUTO_INCREMENT, `tag` varchar(255) DEFAULT NULL, PRIMARY KEY (`id_tag`), UNIQUE KEY `tag` (`tag`), KEY `id_tag` (`id_tag`), KEY `tag_2` (`tag`), KEY `tag_3` (`tag`), KEY `tag_4` (`tag`), FULLTEXT KEY `tag_5` (`tag`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=8 ; INSERT INTO `tags` (`id_tag`, `tag`) VALUES (1, 'Sandeepan'), (2, 'Nath'), (3, 'first'), (4, 'class'), (5, 'new'), (6, 'Bob'), (7, 'Cratchit'); Please note that for every class, the tag rels of its tutor have to be duplicated. Example, for class with id_wc=1, the tag rel records with id_tag_rel = 3 and 4 are actually extras if you compare with the tag rel records with id_tag_rel = 1 and 2. System 2 (present system - profile edition easy, search difficult) Two separate tables Tutors_Tag_Relations and Webclasses_Tag_Relations have the corresponding tag relations data (Please dump into a separate database)- CREATE TABLE IF NOT EXISTS `tutors_tag_relations` ( `id_tag_rel` int(10) NOT NULL AUTO_INCREMENT, `id_tag` int(10) unsigned NOT NULL DEFAULT '0', `id_tutor` int(10) DEFAULT NULL, PRIMARY KEY (`id_tag_rel`), KEY `All_Tag_Relations_FKIndex1` (`id_tag`), KEY `id_tag` (`id_tag`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1; INSERT INTO `tutors_tag_relations` (`id_tag_rel`, `id_tag`, `id_tutor`) VALUES (1, 1, 1), (2, 2, 1), (3, 6, 2), (4, 7, 2); CREATE TABLE IF NOT EXISTS `webclasses_tag_relations` ( `id_tag_rel` int(10) NOT NULL AUTO_INCREMENT, `id_tag` int(10) unsigned NOT NULL DEFAULT '0', `id_tutor` int(10) DEFAULT NULL, `id_wc` int(10) DEFAULT NULL, PRIMARY KEY (`id_tag_rel`), KEY `webclasses_Tag_Relations_FKIndex1` (`id_tag`), KEY `id_wc` (`id_wc`), KEY `id_tag` (`id_tag`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1; INSERT INTO `webclasses_tag_relations` (`id_tag_rel`, `id_tag`, `id_tutor`, `id_wc`) VALUES (1, 3, 1, 1), (2, 4, 1, 1), (3, 5, 2, 2), (4, 4, 2, 2); CREATE TABLE IF NOT EXISTS `tags` ( `id_tag` int(10) unsigned NOT NULL AUTO_INCREMENT, `tag` varchar(255) DEFAULT NULL, PRIMARY KEY (`id_tag`), UNIQUE KEY `tag` (`tag`), KEY `id_tag` (`id_tag`), KEY `tag_2` (`tag`), KEY `tag_3` (`tag`), KEY `tag_4` (`tag`), FULLTEXT KEY `tag_5` (`tag`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=8 ; INSERT INTO `tags` (`id_tag`, `tag`) VALUES (1, 'Sandeepan'), (2, 'Nath'), (3, 'first'), (4, 'class'), (5, 'new'), (6, 'Bob'), (7, 'Cratchit'); CREATE TABLE IF NOT EXISTS `all_tag_relations` ( `id_tag_rel` int(10) NOT NULL AUTO_INCREMENT, `id_tag` int(10) unsigned NOT NULL DEFAULT '0', `id_tutor` int(10) DEFAULT NULL, `id_wc` int(10) unsigned DEFAULT NULL, PRIMARY KEY (`id_tag_rel`), KEY `All_Tag_Relations_FKIndex1` (`id_tag`), KEY `id_wc` (`id_wc`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1; insert into All_Tag_Relations select NULL,id_tag,id_tutor,NULL from Tutors_Tag_Relations; insert into All_Tag_Relations select NULL,id_tag,id_tutor,id_wc from Webclasses_Tag_Relations; Here you can see how easily tutor first name can be edited only in one place. But search has become really difficult, so on being advised to use a Temporary table, I am creating one at every search request, then dumping all the necessary data and then searching from it, I am creating this All_Tag_Relations table at search run time. Here I am just dumping all the data from the two tables Tutors_Tag_Relations and Webclasses_Tag_Relations. But, I am still not able to get classes if I search with tutor name This is the query which searches "first class". Running them on both the systems shows correct results (returns the class with id_wc = 1). SELECT wtagrels.id_wc,SUM(DISTINCT( wtagrels.id_tag =3)) AS key_1_total_matches, SUM(DISTINCT( wtagrels.id_tag =4)) AS key_2_total_matches FROM all_tag_relations AS wtagrels WHERE ( wtagrels.id_tag =3 OR wtagrels.id_tag =4 ) GROUP BY wtagrels.id_wc HAVING key_1_total_matches = 1 AND key_2_total_matches = 1 LIMIT 0, 20 But, searching for "Sandeepan class" works only with the 1st system Here is the query which searches "Sandeepan class" SELECT wtagrels.id_wc,SUM(DISTINCT( wtagrels.id_tag =1)) AS key_1_total_matches, SUM(DISTINCT( wtagrels.id_tag =4)) AS key_2_total_matches FROM all_tag_relations AS wtagrels WHERE ( wtagrels.id_tag =1 OR wtagrels.id_tag =4 ) GROUP BY wtagrels.id_wc HAVING key_1_total_matches = 1 AND key_2_total_matches = 1 LIMIT 0, 20 Can anybody alter this query and somehow do a proper join or something to get correct results. That solves my problem in a nice way. As you can figure out, the reason why it does not work in system 2 is that in system 1, for every class, one additional tag relation linking class and tutor name is present. e.g. for class first class, (records with id_tag_rel 3 and 4) which returns the class on searching with tutor name. So, you see the trade-off between the search and profile edition difficulty with the two systems. How do I overcome both. I have to reach a conclusion soon. So far my reasoning is it is definitely not good from a code maintainability point of view to follow the single tag rel table structure of system one, because in a real system while editing a field like "tutor qualifications", there can be as many records in tag rels table as there are words in qualification of a tutor (one word in a field = one tag relation). Now suppose a tutor has 100 classes. When he edits his qualification, all the tag rel rows corresponding to him are deleted and then as many copies are to be created (as per the new qualification data) as there are classes. This becomes particularly difficult if later more searcheable fields are added. The code cannot be robust. Is the best solution to follow system 2 (edition has to be in one table - no extra work for each and every class) and somehow re-create the all_tag_relations table like system 1 (from the tables tutor_tag_relations and webclasses_tag_relations), creating the extra tutor tag rels for each and every class by a tutor (which is currently missing in system 2's temporary all_tag_relations table). That would be a time consuming logic script. I doubt that table can be recreated without resorting to PHP sript (mysql alone cannot do that). But the problem is that running all this at search time will make search definitely slow. So, how do such systems work? How are such situations handled? I thought about we can run a cron which initiates that PHP script, say every 1 minute and replaces the existing all_tag_relations table as per new tag rels from tutor_tag_relations and webclasses_tag_relations (replaces means creates a new table, deletes the original and renames the new one as all_tag_relations, otherwise search won't work during that period- or is there any better way to that?). Anyway, the result would be that any changes by tutors will reflect in search in the next 1 minute and not immediately. An alternateve would be to initate that PHP script every time a tutor edits his profile. But here again, since many users may edit their profiles concurrently, will the creation of so many tables be a burden and can mysql make the server slow? Any help would be appreciated and working solution will be accepted as answer. Thanks, Sandeepan

    Read the article

  • Dealing with external processes

    - by Jesse Aldridge
    I've been working on a gui app that needs to manage external processes. Working with external processes leads to a lot of issues that can make a programmer's life difficult. I feel like maintenence on this app is taking an unacceptably long time. I've been trying to list the things that make working with external processes difficult so that I can come up with ways of mitigating the pain. This kind of turned into a rant which I thought I'd post here in order to get some feedback and to provide some guidance to anybody thinking about sailing into these very murky waters. Here's what I've got so far: Output from the child can get mixed up with output from the parent. This can make both outputs misleading and hard to read. It can be hard to tell what came from where. It becomes harder to figure out what's going on when things are asynchronous. Here's a contrived example: import textwrap, os, time from subprocess import Popen test_path = 'test_file.py' with open(test_path, 'w') as file: file.write(textwrap.dedent(''' import time for i in range(3): print 'Hello %i' % i time.sleep(1)''')) proc = Popen('python -B "%s"' % test_path) for i in range(3): print 'Hello %i' % i time.sleep(1) os.remove(test_path) I guess I could have the child process write its output to a file. But it can be annoying to have to open up a file every time I want to see the result of a print statement. If I have code for the child process I could add a label, something like print 'child: Hello %i', but it can be annoying to do that for every print. And it adds some noise to the output. And of course I can't do it if I don't have access to the code. I could manually manage the process output. But then you open up a huge can of worms with threads and polling and stuff like that. A simple solution is to treat processes like synchronous functions, that is, no further code executes until the process completes. In other words, make the process block. But that doesn't work if you're building a gui app. Which brings me to the next problem... Blocking processes cause the gui to become unresponsive. import textwrap, sys, os from subprocess import Popen from PyQt4.QtGui import * from PyQt4.QtCore import * test_path = 'test_file.py' with open(test_path, 'w') as file: file.write(textwrap.dedent(''' import time for i in range(3): print 'Hello %i' % i time.sleep(1)''')) app = QApplication(sys.argv) button = QPushButton('Launch process') def launch_proc(): # Can't move the window until process completes proc = Popen('python -B "%s"' % test_path) proc.communicate() button.connect(button, SIGNAL('clicked()'), launch_proc) button.show() app.exec_() os.remove(test_path) Qt provides a process wrapper of its own called QProcess which can help with this. You can connect functions to signals to capture output relatively easily. This is what I'm currently using. But I'm finding that all these signals behave suspiciously like goto statements and can lead to spaghetti code. I think I want to get sort-of blocking behavior by having the 'finished' signal from QProcess call a function containing all the code that comes after the process call. I think that should work but I'm still a bit fuzzy on the details... Stack traces get interrupted when you go from the child process back to the parent process. If a normal function screws up, you get a nice complete stack trace with filenames and line numbers. If a subprocess screws up, you'll be lucky if you get any output at all. You end up having to do a lot more detective work everytime something goes wrong. Speaking of which, output has a way of disappearing when dealing external processes. Like if you run something via the windows 'cmd' command, the console will pop up, execute the code, and then disappear before you have a chance to see the output. You have to pass the /k flag to make it stick around. Similar issues seem to crop up all the time. I suppose both problems 3 and 4 have the same root cause: no exception handling. Exception handling is meant to be used with functions, it doesn't work with processes. Maybe there's some way to get something like exception handling for processes? I guess that's what stderr is for? But dealing with two different streams can be annoying in itself. Maybe I should look into this more... Processes can hang and stick around in the background without you realizing it. So you end up yelling at your computer cuz it's going so slow until you finally bring up your task manager and see 30 instances of the same process hanging out in the background. Also, hanging background processes can interefere with other instances of the process in various fun ways, such as causing permissions errors by holding a handle to a file or someting like that. It seems like an easy solution to this would be to have the parent process kill the child process on exit if the child process didn't close itself. But if the parent process crashes, cleanup code might not get called and the child can be left hanging. Also, if the parent waits for the child to complete, and the child is in an infinite loop or something, you can end up with two hanging processes. This problem can tie in to problem 2 for extra fun, causing your gui to stop responding entirely and force you to kill everything with the task manager. F***ing quotes Parameters often need to be passed to processes. This is a headache in itself. Especially if you're dealing with file paths. Say... 'C:/My Documents/whatever/'. If you don't have quotes, the string will often be split at the space and interpreted as two arguments. If you need nested quotes you can use ' and ". But if you need to use more than two layers of quotes, you have to do some nasty escaping, for example: "cmd /k 'python \'path 1\' \'path 2\''". A good solution to this problem is passing parameters as a list rather than as a single string. Subprocess allows you to do this. Can't easily return data from a subprocess. You can use stdout of course. But what if you want to throw a print in there for debugging purposes? That's gonna screw up the parent if it's expecting output formatted a certain way. In functions you can print one string and return another and everything works just fine. Obscure command-line flags and a crappy terminal based help system. These are problems I often run into when using os level apps. Like the /k flag I mentioned, for holding a cmd window open, who's idea was that? Unix apps don't tend to be much friendlier in this regard. Hopefully you can use google or StackOverflow to find the answer you need. But if not, you've got a lot of boring reading and frusterating trial and error to do. External factors. This one's kind of fuzzy. But when you leave the relatively sheltered harbor of your own scripts to deal with external processes you find yourself having to deal with the "outside world" to a much greater extent. And that's a scary place. All sorts of things can go wrong. Just to give a random example: the cwd in which a process is run can modify it's behavior. There are probably other issues, but those are the ones I've written down so far. Any other snags you'd like to add? Any suggestions for dealing with these problems?

    Read the article

  • Implementing a robust async stream reader for a console

    - by Jon
    I recently provided an answer to this question: C# - Realtime console output redirection. As often happens, explaining stuff (here "stuff" was how I tackled a similar problem) leads you to greater understanding and/or, as is the case here, "oops" moments. I realized that my solution, as implemented, has a bug. The bug has little practical importance, but it has an extremely large importance to me as a developer: I can't rest easy knowing that my code has the potential to blow up. Squashing the bug is the purpose of this question. I apologize for the long intro, so let's get dirty. I wanted to build a class that allows me to receive input from a Stream in an event-based manner. The stream, in my scenario, is guaranteed to be a FileStream and there is also an associated StreamReader already present to leverage. The public interface of the class is this: public class MyStreamManager { public event EventHandler<ConsoleOutputReadEventArgs> StandardOutputRead; public void StartSendingEvents(); public void StopSendingEvents(); } Obviously this specific scenario has to do with a console's standard output. StartSendingEvents and StopSendingEvents do what they advertise; for the purposes of this discussion, we can assume that events are always being sent without loss of generality. The class uses these two fields internally: protected readonly StringBuilder inputAccumulator = new StringBuilder(); protected readonly byte[] buffer = new byte[256]; The functionality of the class is implemented in the methods below. To get the ball rolling: public void StartSendingEvents(); { this.stopAutomation = false; this.BeginReadAsync(); } To read data out of the Stream without blocking, and also without requiring a carriage return char, BeginRead is called: protected void BeginReadAsync() { if (!this.stopAutomation) { this.StandardOutput.BaseStream.BeginRead( this.buffer, 0, this.buffer.Length, this.ReadHappened, null); } } The challenging part: BeginRead requires using a buffer. This means that when reading from the stream, it is possible that the bytes available to read ("incoming chunk") are larger than the buffer. Since we are only handing off data from the stream to a consumer, and that consumer may well have inside knowledge about the size and/or format of these chunks, I want to call event subscribers exactly once for each chunk. Otherwise the abstraction breaks down and the subscribers have to buffer the incoming data and reconstruct the chunks themselves using said knowledge. This is much less convenient to the calling code, and detracts from the usefulness of my class. Edit: There are comments below correctly stating that since the data is coming from a stream, there is absolutely nothing that the receiver can infer about the structure of the data unless it is fully prepared to parse it. What I am trying to do here is leverage the "flush the output" "structure" that the owner of the console imparts while writing on it. I am prepared to assume (better: allow my caller to have the option to assume) that the OS will pass me the data written between two flushes of the stream in exactly one piece. To this end, if the buffer is full after EndRead, we don't send its contents to subscribers immediately but instead append them to a StringBuilder. The contents of the StringBuilder are only sent back whenever there is no more to read from the stream (thus preserving the chunks). private void ReadHappened(IAsyncResult asyncResult) { var bytesRead = this.StandardOutput.BaseStream.EndRead(asyncResult); if (bytesRead == 0) { this.OnAutomationStopped(); return; } var input = this.StandardOutput.CurrentEncoding.GetString( this.buffer, 0, bytesRead); this.inputAccumulator.Append(input); if (bytesRead < this.buffer.Length) { this.OnInputRead(); // only send back if we 're sure we got it all } this.BeginReadAsync(); // continue "looping" with BeginRead } After any read which is not enough to fill the buffer, all accumulated data is sent to the subscribers: private void OnInputRead() { var handler = this.StandardOutputRead; if (handler == null) { return; } handler(this, new ConsoleOutputReadEventArgs(this.inputAccumulator.ToString())); this.inputAccumulator.Clear(); } (I know that as long as there are no subscribers the data gets accumulated forever. This is a deliberate decision). The good This scheme works almost perfectly: Async functionality without spawning any threads Very convenient to the calling code (just subscribe to an event) Maintains the "chunkiness" of the data; this allows the calling code to use inside knowledge of the data without doing any extra work Is almost agnostic to the buffer size (it will work correctly with any size buffer irrespective of the data being read) The bad That last almost is a very big one. Consider what happens when there is an incoming chunk with length exactly equal to the size of the buffer. The chunk will be read and buffered, but the event will not be triggered. This will be followed up by a BeginRead that expects to find more data belonging to the current chunk in order to send it back all in one piece, but... there will be no more data in the stream. In fact, as long as data is put into the stream in chunks with length exactly equal to the buffer size, the data will be buffered and the event will never be triggered. This scenario may be highly unlikely to occur in practice, especially since we can pick any number for the buffer size, but the problem is there. Solution? Unfortunately, after checking the available methods on FileStream and StreamReader, I can't find anything which lets me peek into the stream while also allowing async methods to be used on it. One "solution" would be to have a thread wait on a ManualResetEvent after the "buffer filled" condition is detected. If the event is not signaled (by the async callback) in a small amount of time, then more data from the stream will not be forthcoming and the data accumulated so far should be sent to subscribers. However, this introduces the need for another thread, requires thread synchronization, and is plain inelegant. Specifying a timeout for BeginRead would also suffice (call back into my code every now and then so I can check if there's data to be sent back; most of the time there will not be anything to do, so I expect the performance hit to be negligible). But it looks like timeouts are not supported in FileStream. Since I imagine that async calls with timeouts are an option in bare Win32, another approach might be to PInvoke the hell out of the problem. But this is also undesirable as it will introduce complexity and simply be a pain to code. Is there an elegant way to get around the problem? Thanks for being patient enough to read all of this.

    Read the article

  • Uneditable file and Unreadable(for further processing) file( WHY? ) after processing it through C++

    - by mgj
    Hi...:) This might look to be a very long question to you I understand, but trust me on this its not long. I am not able to identify why after processing this text is not being able to be read and edited. I tried using the ord() function in python to check if the text contains any Unicode characters( non ascii characters) apart from the ascii ones.. I found quite a number of them. I have a strong feeling that this could be due to the original text itself( The INPUT ). Input-File: Just copy paste it into a file "acle5v1.txt" The objective of this code below is to check for upper case characters and to convert it to lower case and also to remove all punctuations so that these words are taken for further processing for word alignment #include<iostrea> #include<fstream> #include<ctype.h> #include<cstring> using namespace std; ifstream fin2("acle5v1.txt"); ofstream fin3("acle5v1_op.txt"); ofstream fin4("chkcharadded.txt"); ofstream fin5("chkcharntadded.txt"); ofstream fin6("chkprintchar.txt"); ofstream fin7("chknonasci.txt"); ofstream fin8("nonprinchar.txt"); int main() { char ch,ch1; fin2.seekg(0); fin3.seekp(0); int flag = 0; while(!fin2.eof()) { ch1=ch; fin2.get(ch); if (isprint(ch))// if the character is printable flag = 1; if(flag) { fin6<<"Printable character:\t"<<ch<<"\t"<<(int)ch<<endl; flag = 0; } else { fin8<<"Non printable character caught:\t"<<ch<<"\t"<<int(ch)<<endl; } if( isalnum(ch) || ch == '@' || ch == ' ' )// checks for alpha numeric characters { fin4<<"char added: "<<ch<<"\tits ascii value: "<<int(ch)<<endl; if(isupper(ch)) { //tolower(ch); fin3<<(char)tolower(ch); } else { fin3<<ch; } } else if( ( ch=='\t' || ch=='.' || ch==',' || ch=='#' || ch=='?' || ch=='!' || ch=='"' || ch != ';' || ch != ':') && ch1 != ' ' ) { fin3<<' '; } else if( (ch=='\t' || ch=='.' || ch==',' || ch=='#' || ch=='?' || ch=='!' || ch=='"' || ch != ';' || ch != ':') && ch1 == ' ' ) { //fin3<<" '; } else if( !(int(ch)>=0 && int(ch)<=127) ) { fin5<<"Char of ascii within range not added: "<<ch<<"\tits ascii value: "<<int(ch)<<endl; } else { fin7<<"Non ascii character caught(could be a -ve value also)\t"<<ch<<int(ch)<<endl; } } return 0; } I have a similar code as the above written in python which gives me an otput which is again not readable and not editable The code in python looks like this: #!/usr/bin/python # -*- coding: UTF-8 -*- import sys input_file=sys.argv[1] output_file=sys.argv[2] list1=[] f=open(input_file) for line in f: line=line.strip() #line=line.rstrip('.') line=line.replace('.','') line=line.replace(',','') line=line.replace('#','') line=line.replace('?','') line=line.replace('!','') line=line.replace('"','') line=line.replace('?','') line=line.replace('|','') line = line.lower() list1.append(line) f.close() f1=open(output_file,'w') f1.write(' '.join(list1)) f1.close() the file takes ip and op at runtime.. as: python punc_remover.py acle5v1.txt acle5v1_op.txt The output of this file is in "acle5v1_op.txt" now after processing this particular output file is needed for further processing. This particular file "aclee5v1_op.txt" is the UNREADABLE Aand UNEDITABLE File that I am not being able to use for further processing. I need this for Word alignment in NLP. I tried readin this output with the following program #include<iostream> #include<fstream> using namespace std; ifstream fin1("acle5v1_op.txt"); ofstream fout1("chckread_acle5v1_op.txt"); ofstream fout2("chcknotread_acle5v1_op.txt"); int main() { char ch; int flag = 0; long int r = 0; long int nr = 0; while(!(fin1)) { fin1.get(ch); if(ch) { flag = 1; } if(flag) { fout1<<ch; flag = 0; r++; } else { fout2<<"Char not been able to be read from source file\n"; nr++; } } cout<<"Number of characters able to be read: "<<r; cout<<endl<<"Number of characters not been able to be read: "<<nr; return 0; } which prints the character if its readable and if not it doesn't print them but I observed the output of both the file is blank thus I could draw a conclusion that this file "acle5v1_op.txt" is UNREADABLE AND UNEDITABLE. Could you please help me on how to deal with this problem.. To tell you a bit about the statistics wrt the original input file "acle5v1.txt" file it has around 3441 lines in it and around 3 million characters in it. Keeping in mind the number of characters in the file you editor might/might not be able to manage to open the file.. I was able to open the file in gedit of Fedora 10 which I am currently using .. This is just to notify you that opening with a particular editor was not actually an issue at least in my case... Can I use scripting languages like Python and Perl to deal with this problem if Yes how? could please be specific on that regard as I am a novice to Perl and Python. Or could you please tell me how do I solve this problem using C++ itself.. Thank you...:) I am really looking forward to some help or guidance on how to go about this problem....

    Read the article

  • Can this Query can be corrected or different table structure needed? (question is clear, detailed, d

    - by sandeepan
    This is a bit lengthy but I have provided sufficient details and kept things very clear. Please see if you can help. (I will surely accept answer if it solves my problem) I am sure a person experienced with this can surely help or suggest me to decide the tables structure. About the system:- There are tutors who create classes A tags based search approach is being followed Tag relations are created/edited when new tutors registers/edits profile data and when tutors create classes (this makes tutors and classes searcheable).For simplicity, let us consider only tutor name and class name are the fields which are matched against search keywords. In this example, I am considering - tutor "Sandeepan Nath" has created a class called "first class" tutor "Bob Cratchit" has created a class called "new class" Desired search results- AND logic to be appied on the search keywords and match against class and tutor data(class name + tutor name), in other words, All those classes be shown such that all the search terms are present in the class name or its tutor name. Example to be clear - Searching "first class" returns class with id_wc = 1. Working Searching "Sandeepan class" should also return class with id_wc = 1. Not working in System 2. Problem with profile editing and searching To tell in one sentence, I am facing a conflict between the ease of profile edition (edition of tag relations when tutor profiles are edited) and the ease of search logic. In the beginning, we had one table structure and search was easy but tag edition logic was very clumsy and unmaintainable(Check System 1 in the section below) . So we created separate tag relations tables to make profile edition simpler but search has become difficult. Please dump the tables so that you can run the search query I have given below and see the results. System 1 (previous system - search easy - profile edition difficult):- Only one table called All_Tag_Relations table had the all the tag relations. The tags table below is common to both systems 1 and 2. CREATE TABLE IF NOT EXISTS `all_tag_relations` ( `id_tag_rel` int(10) NOT NULL AUTO_INCREMENT, `id_tag` int(10) unsigned NOT NULL DEFAULT '0', `id_tutor` int(10) DEFAULT NULL, `id_wc` int(10) unsigned DEFAULT NULL, PRIMARY KEY (`id_tag_rel`), KEY `All_Tag_Relations_FKIndex1` (`id_tag`), KEY `id_wc` (`id_wc`), KEY `id_tag` (`id_tag`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1; INSERT INTO `all_tag_relations` (`id_tag_rel`, `id_tag`, `id_tutor`, `id_wc`) VALUES (1, 1, 1, NULL), (2, 2, 1, NULL), (3, 1, 1, 1), (4, 2, 1, 1), (5, 3, 1, 1), (6, 4, 1, 1), (7, 6, 2, NULL), (8, 7, 2, NULL), (9, 6, 2, 2), (10, 7, 2, 2), (11, 5, 2, 2), (12, 4, 2, 2); CREATE TABLE IF NOT EXISTS `tags` ( `id_tag` int(10) unsigned NOT NULL AUTO_INCREMENT, `tag` varchar(255) DEFAULT NULL, PRIMARY KEY (`id_tag`), UNIQUE KEY `tag` (`tag`), KEY `id_tag` (`id_tag`), KEY `tag_2` (`tag`), KEY `tag_3` (`tag`), KEY `tag_4` (`tag`), FULLTEXT KEY `tag_5` (`tag`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=8 ; INSERT INTO `tags` (`id_tag`, `tag`) VALUES (1, 'Sandeepan'), (2, 'Nath'), (3, 'first'), (4, 'class'), (5, 'new'), (6, 'Bob'), (7, 'Cratchit'); Please note that for every class, the tag rels of its tutor have to be duplicated. Example, for class with id_wc=1, the tag rel records with id_tag_rel = 3 and 4 are actually extras if you compare with the tag rel records with id_tag_rel = 1 and 2. System 2 (present system - profile edition easy, search difficult) Two separate tables Tutors_Tag_Relations and Webclasses_Tag_Relations have the corresponding tag relations data (Please dump into a separate database)- CREATE TABLE IF NOT EXISTS `tutors_tag_relations` ( `id_tag_rel` int(10) NOT NULL AUTO_INCREMENT, `id_tag` int(10) unsigned NOT NULL DEFAULT '0', `id_tutor` int(10) DEFAULT NULL, PRIMARY KEY (`id_tag_rel`), KEY `All_Tag_Relations_FKIndex1` (`id_tag`), KEY `id_tag` (`id_tag`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1; INSERT INTO `tutors_tag_relations` (`id_tag_rel`, `id_tag`, `id_tutor`) VALUES (1, 1, 1), (2, 2, 1), (3, 6, 2), (4, 7, 2); CREATE TABLE IF NOT EXISTS `webclasses_tag_relations` ( `id_tag_rel` int(10) NOT NULL AUTO_INCREMENT, `id_tag` int(10) unsigned NOT NULL DEFAULT '0', `id_tutor` int(10) DEFAULT NULL, `id_wc` int(10) DEFAULT NULL, PRIMARY KEY (`id_tag_rel`), KEY `webclasses_Tag_Relations_FKIndex1` (`id_tag`), KEY `id_wc` (`id_wc`), KEY `id_tag` (`id_tag`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1; INSERT INTO `webclasses_tag_relations` (`id_tag_rel`, `id_tag`, `id_tutor`, `id_wc`) VALUES (1, 3, 1, 1), (2, 4, 1, 1), (3, 5, 2, 2), (4, 4, 2, 2); CREATE TABLE IF NOT EXISTS `tags` ( `id_tag` int(10) unsigned NOT NULL AUTO_INCREMENT, `tag` varchar(255) DEFAULT NULL, PRIMARY KEY (`id_tag`), UNIQUE KEY `tag` (`tag`), KEY `id_tag` (`id_tag`), KEY `tag_2` (`tag`), KEY `tag_3` (`tag`), KEY `tag_4` (`tag`), FULLTEXT KEY `tag_5` (`tag`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=8 ; INSERT INTO `tags` (`id_tag`, `tag`) VALUES (1, 'Sandeepan'), (2, 'Nath'), (3, 'first'), (4, 'class'), (5, 'new'), (6, 'Bob'), (7, 'Cratchit'); CREATE TABLE IF NOT EXISTS `all_tag_relations` ( `id_tag_rel` int(10) NOT NULL AUTO_INCREMENT, `id_tag` int(10) unsigned NOT NULL DEFAULT '0', `id_tutor` int(10) DEFAULT NULL, `id_wc` int(10) unsigned DEFAULT NULL, PRIMARY KEY (`id_tag_rel`), KEY `All_Tag_Relations_FKIndex1` (`id_tag`), KEY `id_wc` (`id_wc`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1; insert into All_Tag_Relations select NULL,id_tag,id_tutor,NULL from Tutors_Tag_Relations; insert into All_Tag_Relations select NULL,id_tag,id_tutor,id_wc from Webclasses_Tag_Relations; Here you can see how easily tutor first name can be edited only in one place. But search has become really difficult, so on being advised to use a Temporary table, I am creating one at every search request, then dumping all the necessary data and then searching from it, I am creating this All_Tag_Relations table at search run time. Here I am just dumping all the data from the two tables Tutors_Tag_Relations and Webclasses_Tag_Relations. But, I am still not able to get classes if I search with tutor name This is the query which searches "first class". Running them on both the systems shows correct results (returns the class with id_wc = 1). SELECT wtagrels.id_wc,SUM(DISTINCT( wtagrels.id_tag =3)) AS key_1_total_matches, SUM(DISTINCT( wtagrels.id_tag =4)) AS key_2_total_matches FROM all_tag_relations AS wtagrels WHERE ( wtagrels.id_tag =3 OR wtagrels.id_tag =4 ) GROUP BY wtagrels.id_wc HAVING key_1_total_matches = 1 AND key_2_total_matches = 1 LIMIT 0, 20 But, searching for "Sandeepan class" works only with the 1st system Here is the query which searches "Sandeepan class" SELECT wtagrels.id_wc,SUM(DISTINCT( wtagrels.id_tag =1)) AS key_1_total_matches, SUM(DISTINCT( wtagrels.id_tag =4)) AS key_2_total_matches FROM all_tag_relations AS wtagrels WHERE ( wtagrels.id_tag =1 OR wtagrels.id_tag =4 ) GROUP BY wtagrels.id_wc HAVING key_1_total_matches = 1 AND key_2_total_matches = 1 LIMIT 0, 20 Can anybody alter this query and somehow do a proper join or something to get correct results. That solves my problem in a nice way. As you can figure out, the reason why it does not work in system 2 is that in system 1, for every class, one additional tag relation linking class and tutor name is present. e.g. for class first class, (records with id_tag_rel 3 and 4) which returns the class on searching with tutor name. So, you see the trade-off between the search and profile edition difficulty with the two systems. How do I overcome both. I have to reach a conclusion soon. So far my reasoning is it is definitely not good from a code maintainability point of view to follow the single tag rel table structure of system one, because in a real system while editing a field like "tutor qualifications", there can be as many records in tag rels table as there are words in qualification of a tutor (one word in a field = one tag relation). Now suppose a tutor has 100 classes. When he edits his qualification, all the tag rel rows corresponding to him are deleted and then as many copies are to be created (as per the new qualification data) as there are classes. This becomes particularly difficult if later more searcheable fields are added. The code cannot be robust. Is the best solution to follow system 2 (edition has to be in one table - no extra work for each and every class) and somehow re-create the all_tag_relations table like system 1 (from the tables tutor_tag_relations and webclasses_tag_relations), creating the extra tutor tag rels for each and every class by a tutor (which is currently missing in system 2's temporary all_tag_relations table). That would be a time consuming logic script. I doubt that table can be recreated without resorting to PHP sript (mysql alone cannot do that). But the problem is that running all this at search time will make search definitely slow. So, how do such systems work? How are such situations handled? I thought about we can run a cron which initiates that PHP script, say every 1 minute and replaces the existing all_tag_relations table as per new tag rels from tutor_tag_relations and webclasses_tag_relations (replaces means creates a new table, deletes the original and renames the new one as all_tag_relations, otherwise search won't work during that period- or is there any better way to that?). Anyway, the result would be that any changes by tutors will reflect in search in the next 1 minute and not immediately. An alternateve would be to initate that PHP script every time a tutor edits his profile. But here again, since many users may edit their profiles concurrently, will the creation of so many tables be a burden and can mysql make the server slow? Any help would be appreciated and working solution will be accepted as answer. Thanks, Sandeepan

    Read the article

  • How Should I Generate Trade Statistics For CouchDB/Rails3 Application?

    - by James
    My Problem: I am trying to developing a web application for currency traders. The application allows traders to enter or upload information about their trades and I want to calculate a wide variety of statistics based on what the user entered. Now, normally I would use a relational database for this, but I have two requirements that don't fit well with a relational database so I am attempting to use couchdb. Those two problems are: 1) Primarily, I have a companion desktop application that users will be able to work with and replicate to the site using couchdb's awesome replication feature and 2) I would like to allow users to be able to define their own custom things to track about trades and generate results based off of what they enter. The schema less nature of couch seems perfect here, but it may end up being harder than it sounds. (I already know couch requires you to define views in advance and such so I was just planning on sticking all the custom attributes in an array and then emitting the array in the view and further processing from there.) What I Am Doing: Right now I am just emitting each trade in couch keyed by each user's system and querying with the key of the system to get an array of trades per system. Simple. I am not using a reduce function currently to calculate any stats because I couldn't figure out how to get everything I need without getting a reduce overflow error. Here is an example of rows that are getting emitted from couch: {"total_rows":134,"offset":0,"rows":[ {"id":"5b1dcd47221e160d8721feee4ccc64be", "key":["80e40ba2fa43589d57ec3f1d19db41e6","2010/05/14 04:32:37 +0000"], null, "doc":{ "_id":"5b1dcd47221e160d8721feee4ccc64be", "_rev":"1-bc9fe763e2637694df47d6f5efb58e5b", "couchrest-type":"Trade", "system":"80e40ba2fa43589d57ec3f1d19db41e6", "pair":"EUR/USD", "direction":"Buy", "entry":12600, "exit":12700, "stop_loss":12500, "profit_target":12700, "status":"Closed", "slug":"101332132375", "custom_tracking": [{"name":"signal", "value":"Pin Bar"}] "updated_at":"2010/05/14 04:32:37 +0000", "created_at":"2010/05/14 04:32:37 +0000", "result":100}} ]} In my rails 3 controller I am basically just populating an array of trades such as the one above and then extracting out the relevant data into smaller arrays that I can compute my statistics on. Here is my show action for the page that I want to display the stats and all the trades: def show @trades = Trade.by_system(:startkey => [@system.id], :endkey => [@system.id, Time.now ]) @trades.each do |trade| if trade.result > 0 @winning_trades << trade.result elsif trade.result < 0 @losing_trades << trade.result else @breakeven_trades << trade.result end if trade.direction == "Buy" @long_trades << trade.result else @short_trades << trade.result end if trade["custom_tracking"] != nil @custom_tracking << {"result" => trade.result, "variables" => trade["custom_tracking"]} end end end I am omitting some other stuff that is going on, but that is the gist of what I am doing. Then I am calculating stuff in the view layer to produce some results: <% winning_long_trades = @long_trades.reject {|trade| trade <= 0 } %> <% winning_short_trades = @short_trades.reject {|trade| trade <= 0 } %> <ul> <li>Total Trades: <%= @trades.count %></li> <li>Winners: <%= @winning_trades.size %></li> <li>Biggest Winner (Pips): <%= @winning_trades.max %></li> <li>Average Win(Pips): <%= @winning_trades.sum/@winning_trades.size %></li> <li>Losers: <%= @losing_trades.size %></li> <li>Biggest Loser (Pips): <%= @losing_trades.min %></li> <li>Average Loss(Pips): <%= @losing_trades.sum/@losing_trades.size %></li> <li>Breakeven Trades: <%= @breakeven_trades.size %></li> <li>Long Trades: <%= @long_trades.size %></li> <li>Winning Long Trades: <%= winning_long_trades.size %></li> <li>Short Trades: <%= @short_trades.size %></li> <li>Winning Short Trades: <%= winning_short_trades.size %></li> <li>Total Pips: <%= @winning_trades.sum + @losing_trades.sum %></li> <li>Win Rate (%): <%= @winning_trades.size/@trades.count.to_f * 100 %></li> </ul> This produces the following results, which aside from a few things is exactly what I want: Total Trades: 134 Winners: 70 Biggest Winner (Pips): 1488 Average Win(Pips): 440 Losers: 58 Biggest Loser (Pips): -516 Average Loss(Pips): -225 Breakeven Trades: 6 Long Trades: 125 Winning Long Trades: 67 Short Trades: 9 Winning Short Trades: 3 Total Pips: 17819 Win Rate (%): 52.23880597014925 What I Am Wondering- Finally The Actual Questions: I am starting to get really skeptical of how well this method will work when a user has 5,000 trades instead of just 134 like in this example. I anticipate most users will only have somewhere under 200 per year, but some users may have a couple thousand trades per year. Probably no more than 5,000 per year. It seems to work ok now, but the page load times are already getting a tad high for my tastes. (About 800ms to generate the page according to rails logs with about a 250ms of that spent in the view layer.) I will end up caching this page I am sure, but I still need the regenerate the page each time a trade is updated and I can't afford to have this be too slow. Sooo..... Is doing something similar here possible with a straight couchdb reduce function? I am assuming handing this off to couch would possibly help with larger data sets. I couldn't figure out how, but I suppose that doesn't mean it isn't possible. If possible, any hints will be helpful. Could I use a list function if a reduce was not available due to reduce constraints? Are couchdb list functions suitable for this type of calculations? Anyone have any idea of whether or not list functions perform well? Any hints what one would look like for the type of calculations I am trying to achieve? I thought about other options such as running the calculations at the time each trade was saved or nightly if I had to and saving the results to a statistics doc that I could then query so that all the processing was done ahead of time. I would like this to be the last resort because then I can't really filter out trades by time periods dynamically like I would really like to. (I want to have a slider that a user can slide to only show trades from that time period using the startkey and endkey in couchdb if I can.) If I should continue running the calculations inside the rails app at the time of the page view, what can I do to improve my current implementation. I am new to rails, couch and programming in general. I am sure that I could be doing something better here. Do I need to create an array for each stat or is there a better way to do that. I guess I just would really like some advice on how to tackle this problem. I want to keep the page generation time minimal since I anticipate these being some of the highest trafficked pages. My gut is that I will need to offload the statistics calculation to either couch or run the stats in advance of when they are called, but I am not sure. Lastly: Like I mentioned above, one of the primary reasons for using couch is to allow users to define their own things to track per trade. Getting the data into couch is no problem, but how would I be able to take the custom_tracking array and find how many winning trades for each named tracking attribute. If anyone can give me any hints to the possibility of doing this that would be great. Thanks a bunch. Would really appreciate any help. Willing to fork out some $$$ if someone wants to take on the problem for me. (Don't know if that is allowed on stack overflow or not.)

    Read the article

  • What does it mean when ARP shows <incomplete> on eth1

    - by Geoff Dalgas
    We have been using HAProxy along with heartbeat from the Linux-HA project. We are using two linux instances to provide a failover. Each server has with their own public IP and a single IP which is shared between the two using a virtual interface (eth1:1) at IP: 69.59.196.211 The virtual interface (eth1:1) IP 69.59.196.211 is configured as the gateway for the windows servers behind them and we use ip_forwarding to route traffic. We are experiencing an occasional network outage on one of our windows servers behind our linux gateways. HAProxy will detect the server is offline which we can verify by remoting to the failed server and attempting to ping the gateway: Pinging 69.59.196.211 with 32 bytes of data: Reply from 69.59.196.220: Destination host unreachable. Running arp -a on this failed server shows that there is no entry for the gateway address (69.59.196.211): Interface: 69.59.196.220 --- 0xa Internet Address Physical Address Type 69.59.196.161 00-26-88-63-c7-80 dynamic 69.59.196.210 00-15-5d-0a-3e-0e dynamic 69.59.196.212 00-21-5e-4d-45-c9 dynamic 69.59.196.213 00-15-5d-00-b2-0d dynamic 69.59.196.215 00-21-5e-4d-61-1a dynamic 69.59.196.217 00-21-5e-4d-2c-e8 dynamic 69.59.196.219 00-21-5e-4d-38-e5 dynamic 69.59.196.221 00-15-5d-00-b2-0d dynamic 69.59.196.222 00-15-5d-0a-3e-09 dynamic 69.59.196.223 ff-ff-ff-ff-ff-ff static 224.0.0.22 01-00-5e-00-00-16 static 224.0.0.252 01-00-5e-00-00-fc static 225.0.0.1 01-00-5e-00-00-01 static On our linux gateway instances arp -a shows: peak-colo-196-220.peak.org (69.59.196.220) at <incomplete> on eth1 stackoverflow.com (69.59.196.212) at 00:21:5e:4d:45:c9 [ether] on eth1 peak-colo-196-215.peak.org (69.59.196.215) at 00:21:5e:4d:61:1a [ether] on eth1 peak-colo-196-219.peak.org (69.59.196.219) at 00:21:5e:4d:38:e5 [ether] on eth1 peak-colo-196-222.peak.org (69.59.196.222) at 00:15:5d:0a:3e:09 [ether] on eth1 peak-colo-196-209.peak.org (69.59.196.209) at 00:26:88:63:c7:80 [ether] on eth1 peak-colo-196-217.peak.org (69.59.196.217) at 00:21:5e:4d:2c:e8 [ether] on eth1 Why would arp occasionally set the entry for this failed server as <incomplete>? Should we be defining our arp entries statically? I've always left arp alone since it works 99% of the time, but in this one instance it appears to be failing. Are there any additional troubleshooting steps we can take help resolve this issue? THINGS WE HAVE TRIED I added a static arp entry for testing on one of the linux gateways which still didn't help. root@haproxy2:~# arp -a peak-colo-196-215.peak.org (69.59.196.215) at 00:21:5e:4d:61:1a [ether] on eth1 peak-colo-196-221.peak.org (69.59.196.221) at 00:15:5d:00:b2:0d [ether] on eth1 stackoverflow.com (69.59.196.212) at 00:21:5e:4d:45:c9 [ether] on eth1 peak-colo-196-219.peak.org (69.59.196.219) at 00:21:5e:4d:38:e5 [ether] on eth1 peak-colo-196-209.peak.org (69.59.196.209) at 00:26:88:63:c7:80 [ether] on eth1 peak-colo-196-217.peak.org (69.59.196.217) at 00:21:5e:4d:2c:e8 [ether] on eth1 peak-colo-196-220.peak.org (69.59.196.220) at 00:21:5e:4d:30:8d [ether] PERM on eth1 root@haproxy2:~# arp -i eth1 -s 69.59.196.220 00:21:5e:4d:30:8d root@haproxy2:~# ping 69.59.196.220 PING 69.59.196.220 (69.59.196.220) 56(84) bytes of data. --- 69.59.196.220 ping statistics --- 7 packets transmitted, 0 received, 100% packet loss, time 6006ms Rebooting the windows web server solves this issue temporarily with no other changes to the network but our experience shows this issue will come back. Swapping network cards and switches I noticed the link light on the port of the switch for the failed windows server was running at 100Mb instead of 1Gb on the failed interface. I moved the cable to several other open ports and the link indicated 100Mb for each port that I tried. I also swapped the cable with the same result. I tried changing the properties of the network card in windows and the server locked up and required a hard reset after clicking apply. This windows server has two physical network interfaces so I have swapped the cables and network settings on the two interfaces to see if the problem follows the interface. If the public interface goes down again we will know that it is not an issue with the network card. (We also tried another switch we have on hand, no change) Changing network hardware driver versions We've had the same problem with the latest Broadcom driver, as well as the built-in driver that ships in Windows Server 2008 R2. Replacing network cables As a last ditch effort we remembered another change that occurred was the replacement of all of the patch cords between our servers / switch. We had purchased two sets, one green of lengths 1ft - 3ft for the private interfaces and another set of red cables for the public interfaces. We swapped out all of the public interface patch cables with a different brand and ran our servers without issue for a full week ... aaaaaand then the problem recurred. Disable checksum offload, remove TProxy We also tried disabling TCP/IP checksum offload in the driver, no change. We're now pulling out TProxy and moving to a more traditional x-forwarded-for network arrangement without any fancy IP address rewriting. We'll see if that helps.

    Read the article

  • "power limit notification" clobbering on 12G Dell servers with RHEL6

    - by Andrew B
    Server: Poweredge r620 OS: RHEL 6.4 Kernel: 2.6.32-358.18.1.el6.x86_64 I'm experiencing application alarms in my production environment. Critical CPU hungry processes are being starved of resources and causing a processing backlog. The problem is happening on all the 12th Generation Dell servers (r620s) in a recently deployed cluster. As near as I can tell, instances of this happening are matching up to peak CPU utilization, accompanied by massive amounts of "power limit notification" spam in dmesg. An excerpt of one of these events: Nov 7 10:15:15 someserver [.crit] CPU12: Core power limit notification (total events = 14) Nov 7 10:15:15 someserver [.crit] CPU0: Core power limit notification (total events = 14) Nov 7 10:15:15 someserver [.crit] CPU6: Core power limit notification (total events = 14) Nov 7 10:15:15 someserver [.crit] CPU14: Core power limit notification (total events = 14) Nov 7 10:15:15 someserver [.crit] CPU18: Core power limit notification (total events = 14) Nov 7 10:15:15 someserver [.crit] CPU2: Core power limit notification (total events = 14) Nov 7 10:15:15 someserver [.crit] CPU4: Core power limit notification (total events = 14) Nov 7 10:15:15 someserver [.crit] CPU16: Core power limit notification (total events = 14) Nov 7 10:15:15 someserver [.crit] CPU0: Package power limit notification (total events = 11) Nov 7 10:15:15 someserver [.crit] CPU6: Package power limit notification (total events = 13) Nov 7 10:15:15 someserver [.crit] CPU14: Package power limit notification (total events = 14) Nov 7 10:15:15 someserver [.crit] CPU18: Package power limit notification (total events = 14) Nov 7 10:15:15 someserver [.crit] CPU20: Core power limit notification (total events = 14) Nov 7 10:15:15 someserver [.crit] CPU8: Core power limit notification (total events = 14) Nov 7 10:15:15 someserver [.crit] CPU2: Package power limit notification (total events = 12) Nov 7 10:15:15 someserver [.crit] CPU10: Core power limit notification (total events = 14) Nov 7 10:15:15 someserver [.crit] CPU22: Core power limit notification (total events = 14) Nov 7 10:15:15 someserver [.crit] CPU4: Package power limit notification (total events = 14) Nov 7 10:15:15 someserver [.crit] CPU16: Package power limit notification (total events = 13) Nov 7 10:15:15 someserver [.crit] CPU20: Package power limit notification (total events = 14) Nov 7 10:15:15 someserver [.crit] CPU8: Package power limit notification (total events = 14) Nov 7 10:15:15 someserver [.crit] CPU10: Package power limit notification (total events = 14) Nov 7 10:15:15 someserver [.crit] CPU22: Package power limit notification (total events = 14) Nov 7 10:15:15 someserver [.crit] CPU15: Core power limit notification (total events = 369) Nov 7 10:15:15 someserver [.crit] CPU3: Core power limit notification (total events = 369) Nov 7 10:15:15 someserver [.crit] CPU1: Core power limit notification (total events = 369) Nov 7 10:15:15 someserver [.crit] CPU5: Core power limit notification (total events = 369) Nov 7 10:15:15 someserver [.crit] CPU17: Core power limit notification (total events = 369) Nov 7 10:15:15 someserver [.crit] CPU13: Core power limit notification (total events = 369) Nov 7 10:15:15 someserver [.crit] CPU15: Package power limit notification (total events = 375) Nov 7 10:15:15 someserver [.crit] CPU3: Package power limit notification (total events = 374) Nov 7 10:15:15 someserver [.crit] CPU1: Package power limit notification (total events = 376) Nov 7 10:15:15 someserver [.crit] CPU5: Package power limit notification (total events = 376) Nov 7 10:15:15 someserver [.crit] CPU7: Core power limit notification (total events = 369) Nov 7 10:15:15 someserver [.crit] CPU19: Core power limit notification (total events = 369) Nov 7 10:15:15 someserver [.crit] CPU17: Package power limit notification (total events = 377) Nov 7 10:15:15 someserver [.crit] CPU9: Core power limit notification (total events = 369) Nov 7 10:15:15 someserver [.crit] CPU21: Core power limit notification (total events = 369) Nov 7 10:15:15 someserver [.crit] CPU23: Core power limit notification (total events = 369) Nov 7 10:15:15 someserver [.crit] CPU11: Core power limit notification (total events = 369) Nov 7 10:15:15 someserver [.crit] CPU13: Package power limit notification (total events = 376) Nov 7 10:15:15 someserver [.crit] CPU7: Package power limit notification (total events = 375) Nov 7 10:15:15 someserver [.crit] CPU19: Package power limit notification (total events = 375) Nov 7 10:15:15 someserver [.crit] CPU9: Package power limit notification (total events = 374) Nov 7 10:15:15 someserver [.crit] CPU21: Package power limit notification (total events = 375) Nov 7 10:15:15 someserver [.crit] CPU23: Package power limit notification (total events = 374) A little Google Fu reveals that this is typically associated with the CPU running hot, or voltage regulation kicking in. I don't think that's what is happening though. Temperature sensors for all servers in the cluster are running fine, Power Cap Policy is disabled in the iDRAC, and my System Profile is set to "Performance" on all of these servers: # omreport chassis biossetup | grep -A10 'System Profile' System Profile Settings ------------------------------------------ System Profile : Performance CPU Power Management : Maximum Performance Memory Frequency : Maximum Performance Turbo Boost : Enabled C1E : Disabled C States : Disabled Monitor/Mwait : Enabled Memory Patrol Scrub : Standard Memory Refresh Rate : 1x Memory Operating Voltage : Auto Collaborative CPU Performance Control : Disabled A Dell mailing list post describes the symptoms almost perfectly. Dell suggested that the author try using the Performance profile, but that didn't help. He ended up applying some settings in Dell's guide for configuring a server for low latency environments and one of those settings (or a combination thereof) seems to have fixed the problem. Kernel.org bug #36182 notes that power-limit interrupt debugging was enabled by default, which is causing performance degradation in scenarios where CPU voltage regulation is kicking in. A RHN KB article (RHN login required) mentions a problem impacting PE r620 and r720 servers not running the Performance profile, and recommends an update to a kernel released two weeks ago. ...Except we are running the Performance profile... Everything I can find online is running me in circles here. What's the heck is going on?

    Read the article

  • PHP pages working slow from time to time

    - by user1038179
    I have VPS with limit of 2GB of ram and 8 CPU cores. I have 5 sites on that VPS (one of them is just for testing, no visitors exept me). All 5 sites are image galleries, like wallpaper sites. Last week I noticed problem on one site (main domain, used for name servers, and also with most traffic, visitors). That site has two image galleries, one is old static html gallery made few years ago and another, main, is powered by ZENPhoto CMS. Also I have that same gallery CMS on another two sites on that same VPS (on one running site and on one just for testing site). On other two sites I have diferent PHP driven gallery. Problem is that after some time (it vary from 10 minutes to few hours after apache restart), loading of pages on main site becomes very slow, or I get 503 Service Temporarily Unavailable error. So pages becomes unavailable. But just that part with new CMS gallery, old part of site with static html pages are working fast and just fine. Also other two sites with same CMS gallery and other two with different PHP driven gallery are working fine and fast at the same time. I thought it must be something with CMS on that main site, because other sites are working nice. Then I tryed to open contact and guest book pages on that main site which are outside of that CMS but also PHP pages, and they do not load too, but that same contact php scipts are working on other sites at the same time. So, when site starts to hangs, ONLY PHP generated content is not working, like I said other static pages are working. And, ONLY on that one main site I have problems. Then I need to restart Apache, after restart everything is vorking nice and fast, for some time, than again, just PHP pages on main site are becomming slower. If I do not restart apache that slowness take some time (several minutes, hours, depending ot traffic) and during that time PHP diven content is loading very slow or unavailable on that site. After sime time, on moments everything start to work and is fast again for some time, and again. In hours with more traffic PHP content is loading slowly or it is unavailable, in hours with less traffic it is sometimes fast and sometimes little bit slower than usually. And ones again, only on that main site, and only PHP driven pages, static pages are working fast even in most traffic hours also other sites with even same CMS are working fast. Currently I have about 7000 unique visitors on that site but site worked nice even with 11500 visitors per day. And about 17000 in total visitors on VPS, all sites ( about 3 pages per unique visitor). When site start to slow down sometimes in apache status I can see something like this: mod_fcgid status: Total FastCGI processes: 37 Process: php5 (/usr/local/cpanel/cgi-sys/php5)Pid Active Idle Accesses State 11300 39 28 7 Working 11274 47 28 7 Working 11296 40 29 3 Working 11283 45 30 3 Working 11304 36 31 1 Working 11282 46 32 3 Working 11292 42 33 1 Working 11289 44 34 1 Working 11305 35 35 0 Working 11273 48 36 2 Working 11280 47 39 1 Working 10125 133 40 12 Exiting(communication error) 11294 41 41 1 Exiting(communication error) 11277 47 42 2 Exiting(communication error) 11291 43 43 1 Exiting(communication error) 10187 108 43 10 Exiting(communication error) 10209 95 44 7 Exiting(communication error) 10171 113 44 5 Exiting(communication error) 11275 47 47 1 Exiting(communication error) 10144 125 48 8 Exiting(communication error) 10086 149 48 20 Exiting(communication error) 10212 94 49 5 Exiting(communication error) 10158 118 49 5 Exiting(communication error) 10169 114 50 4 Exiting(communication error) 10105 141 50 16 Exiting(communication error) 10094 146 50 15 Exiting(communication error) 10115 139 51 17 Exiting(communication error) 10213 93 51 9 Exiting(communication error) 10197 103 51 7 Exiting(communication error) Process: php5 (/usr/local/cpanel/cgi-sys/php5)Pid Active Idle Accesses State 7983 1079 2 149 Ready 7979 1079 11 151 Ready Process: php5 (/usr/local/cpanel/cgi-sys/php5)Pid Active Idle Accesses State 7990 1066 0 57 Ready 8001 1031 64 35 Ready 7999 1032 94 29 Ready 8000 1031 91 36 Ready 8002 1029 34 52 Ready Process: php5 (/usr/local/cpanel/cgi-sys/php5)Pid Active Idle Accesses State 7991 1064 29 115 Ready When it is working nicly there is no lines with "Exiting(communication error)" Active and Idle are time active and time since last request, in seconds. Here are system info. Sysem info: Total processors: 8 Processor #1 Vendor GenuineIntel Name Intel(R) Xeon(R) CPU E5440 @ 2.83GHz Speed 88.320 MHz Cache 6144 KB All other seven are the same. System Information Linux vps.nnnnnnnnnnnnnnnnn.nnn 2.6.18-028stab099.3 #1 SMP Wed Mar 7 15:20:22 MSK 2012 x86_64 x86_64 x86_64 GNU/Linux Current Memory Usage total used free shared buffers cached Mem: 8388608 882164 7506444 0 0 0 -/+ buffers/cache: 882164 7506444 Swap: 0 0 0 Total: 8388608 882164 7506444 Current Disk Usage Filesystem Size Used Avail Use% Mounted on /dev/vzfs 100G 34G 67G 34% / none System Details: Running on: Apache/2.2.22 System info: (Unix) mod_ssl/2.2.22 OpenSSL/0.9.8e-fips-rhel5 DAV/2 mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635 mod_fcgid/2.3.6 Powered by: PHP/5.3.10 Current Configuration Default PHP Version (.php files) 5 PHP 5 Handler fcgi PHP 4 Handler suphp Apache suEXEC on Apache Ruid2 off PHP 4 Handler suphp Apache suEXEC on Apache Configuration The following settings have been saved: fileetag: All keepalive: On keepalivetimeout: 3 maxclients: 150 maxkeepaliverequests: 10 maxrequestsperchild: 10000 maxspareservers: 10 minspareservers: 5 root_options: ExecCGI, FollowSymLinks, Includes, IncludesNOEXEC, Indexes, MultiViews, SymLinksIfOwnerMatch serverlimit: 256 serversignature: Off servertokens: Full sslciphersuite: ALL:!ADH:RC4+RSA:+HIGH:+MEDIUM:-LOW:-SSLv2:-EXP:!kEDH startservers: 5 timeout: 30 I hope, I explained my problem nicely. Any help would be nice.

    Read the article

  • Windows 7 Machine Makes Router Drop -All- Wireless Connections

    - by Hammer Bro.
    Some background: My home network consists of my Desktop, a two-month old Windows 7 (x64) machine which is online most frequently (N-spec), as well as three other Windows XP laptops (all G) that only connect every now and then (one for work, one for Netflix, and the other for infrequent regular laptop uses). I used to have a Belkin F5D8236-4 wireless router, and everything worked great. A week ago, however, I found out that the Belkin absolutely in no way would establish a VPN connection, something that has become important for work. So I bought a Netgear WNR3500v2/U/L. The wireless was acting a little sketchy at first for just the Windows 7 machine, but I thought it had something to do with 802.11N and I was in a hurry so I just fished up an ethernet cable and disabled the computer's wireless. It has now become apparent, though, that whenever the Windows 7 machine is connected to the router, all wireless connections become unstable. I was using my work laptop for a solid six hours today with no trouble, having multiple SSH connections open over VPN and streaming internet radio in the background. Then, within two minutes of turning on this Windows 7 box, I had lost all connectivity over the wireless. And I was two feet away from the router. The same sort of thing happens on all of the other laptops -- Netflix can be playing stuff all weekend, but if I come up here and do things on this (W7) computer, the streaming will be dead within ten minutes. So here are my basic observations: If the Windows 7 machine is off, then all connections will have a Signal Strength of Very Good or Excellent and a Speed of 48-54 Mbps for an indefinite amount of time. Shortly after the Windows 7 machine is turned on, all wireless connections will experience a consistent decline in Speed down to 1.0 Mbps, eventually losing their connection entirely. These machines will continue to maintain 70% signal strength, as observed by themselves and router. Once dropped, a wireless connection will have difficulty reconnecting. And, if a connection manages to become established, it will quickly drop off again. The Windows 7 machine itself will continue to function just fine if it's using a wired connection, although it will experience these same issues over the wireless. All of the drivers and firmwares are up to date, and this happened both with the stock Netgear firmware as well as the (current) DD-WRT. What I've tried: Making sure each computer is being assigned a distinct IP. (They are.) Disabling UPnP and Stateful Packet Inspection on the router. Disabling Network Sharing, SSDP Discovery, TCP/IP NetBios Helper and Computer Browser services on the Windows 7 machine. Disabling QoS Packet Scheduler, IPv6, and Link Layer Topology Discovery options on my ethernet controller (leaving only Client for Microsoft Networks, File and Printer Sharing, and IPv4 enabled). What I think: It seems awfully similar to the problems discussed in detail at http://social.msdn.microsoft.com/Forums/en/wsk/thread/1064e397-9d9b-4ae2-bc8e-c8798e591915 (which was both the most relevant and concrete information I could dig up on the internet). I still think that something the Windows 7 IP stack (or just Operating System itself) is doing is giving the router fits. However, I could be wrong, because I have two key differences. One is that most instances of this problem are reported as the entire router dying or restarting, and mine still works just fine over the wired connection. The other is that it's a new router, tested with both the factory firmware and the (I assume) well-maintained DD-WRT project. Even if Windows 7 is still secretly sending IPv6 packets or the TCP Window Scaling implementation that I hear Vista caused some trouble with (even though I've tried my best to disable anything fancy), this router should support those functions. I don't want to get a new or a replacement router unless someone can convince me that this is a defective unit. But the problem seems too specific and predictable by my instincts to be a hardware hiccup. And I don't want to deal with the inevitable problems that always seem to take half a day to resolve when getting a new router, since I'm frantically working (including tomorrow) to complete a project by next week's deadline. Plus, I think in the worst case scenario, I could keep this router connected directly to the modem, disable its wireless entirely, and connect the old Belkin to it directly. That should allow me to still use VPN (although I'll have to plug my work laptop directly into that router), and then maintain wireless connections for all of the other computers. But that feels so wrong to me. Anyone have any ideas what the cause and possible solution could be? Clarifications: The Windows 7 machine is directly connected via an ethernet cable to the router for everything above. But while it is online, all other computers' wireless connections become unusable. It is not an issue of signal strength or interference -- no other devices within scanning range are using Channel 1, and the problem will affect computers that are literally feet away from the router with 95% signal strength.

    Read the article

  • Load-balancing between a Procurve switch and a server

    - by vlad
    Hello I've been searching around the web for this problem i've been having. It's similar in a way to this question: How exactly & specifically does layer 3 LACP destination address hashing work? My setup is as follows: I have a central switch, a Procurve 2510G-24, image version Y.11.16. It's the center of a star topology, there are four switches connected to it via a single gigabit link. Those switches service the users. On the central switch, I have a server with two gigabit interfaces that I want to bond together in order to achieve higher throughput, and two other servers that have single gigabit connections to the switch. The topology looks as follows: sw1 sw2 sw3 sw4 | | | | --------------------- | sw0 | --------------------- || | | srv1 srv2 srv3 The servers were running FreeBSD 8.1. On srv1 I set up a lagg interface using the lacp protocol, and on the switch I set up a trunk for the two ports using lacp as well. The switch showed that the server was a lacp partner, I could ping the server from another computer, and the server could ping other computers. If I unplugged one of the cables, the connection would keep working, so everything looked fine. Until I tested throughput. There was only one link used between srv1 and sw0. All testing was conducted with iperf, and load distribution was checked with systat -ifstat. I was looking to test the load balancing for both receive and send operations, as I want this server to be a file server. There were therefore two scenarios: iperf -s on srv1 and iperf -c on the other servers iperf -s on the other servers and iperf -c on srv1 connected to all the other servers. Every time only one link was used. If one cable was unplugged, the connections would keep going. However, once the cable was plugged back in, the load was not distributed. Each and every server is able to fill the gigabit link. In one-to-one test scenarios, iperf was reporting around 940Mbps. The CPU usage was around 20%, which means that the servers could withstand a doubling of the throughput. srv1 is a dell poweredge sc1425 with onboard intel 82541GI nics (em driver on freebsd). After troubleshooting a previous problem with vlan tagging on top of a lagg interface, it turned out that the em could not support this. So I figured that maybe something else is wrong with the em drivers and / or lagg stack, so I started up backtrack 4r2 on this same server. So srv1 now uses linux kernel 2.6.35.8. I set up a bonding interface bond0. The kernel module was loaded with option mode=4 in order to get lacp. The switch was happy with the link, I could ping to and from the server. I could even put vlans on top of the bonding interface. However, only half the problem was solved: if I used srv1 as a client to the other servers, iperf was reporting around 940Mbps for each connection, and bwm-ng showed, of course, a nice distribution of the load between the two nics; if I run the iperf server on srv1 and tried to connect with the other servers, there was no load balancing. I thought that maybe I was out of luck and the hashes for the two mac addresses of the clients were the same, so I brought in two new servers and tested with the four of them at the same time, and still nothing changed. I tried disabling and reenabling one of the links, and all that happened was the traffic switched from one link to the other and back to the first again. I also tried setting the trunk to "plain trunk mode" on the switch, and experimented with other bonding modes (roundrobin, xor, alb, tlb) but I never saw any traffic distribution. One interesting thing, though: one of the four switches is a Cisco 2950, image version 12.1(22)EA7. It has 48 10/100 ports and 2 gigabit uplinks. I have a server (call it srv4) with a 4 channel trunk connected to it (4x100), FreeBSD 8.0 release. The switch is connected to sw0 via gigabit. If I set up an iperf server on one of the servers connected to sw0 and a client on srv4, ALL 4 links are used, and iperf reports around 330Mbps. systat -ifstat shows all four interfaces are used. The cisco port-channel uses src-mac to balance the load. The HP should use both the source and destination according to the manual, so it should work as well. Could this mean there is some bug in the HP firmware? Am I doing something wrong?

    Read the article

  • Bind9 as a caching resolver fails with mismatch ID on localhost but not external IP

    - by argibbs
    I'm running Ubuntu 12.04 LTS on a machine on my private network. I have bind9 installed (v9.8.1-P1) via aptitude, so it appears to have put all the bits in the right places and the service starts automatically. I plan on adding some zones later, but first I'm just trying to get it working as a caching resolver. I installed bind, configured it, and starting using it. Initially I thought it was working ok, but then I found some sites weren't being resolved. I've pinned it down to being linked to the size of the result and bind failing-over to TCP mode. So: I'm trying to find out why bind is failing when I query for domain info and the result is 512 bytes (causing a truncation and retry on TCP). Specifically it fails with ID mismatches if I point dig at localhost, but works when I query the machine's own IP (192.168.0.2). This appears to be backwards to the problem that most people have when using bind (fails on external ip, works on localhost). If I do dig @localhost google.com (which has a response of <512 bytes) then it works; I get no warnings, and plenty of output. $ dig @localhost google.com ; <<>> DiG 9.8.1-P1 <<>> @localhost google.com [snip lots of output] ;; Query time: 39 msec ;; SERVER: 127.0.0.1#53(127.0.0.1) ;; WHEN: Thu Oct 17 23:08:34 2013 ;; MSG SIZE rcvd: 495 If I do dig @localhost play.google.com (which has a larger response) then I get back something like: $ dig @localhost play.google.com ;; Truncated, retrying in TCP mode. ;; ERROR: ID mismatch: expected ID 3696, got 27130 This seems to be standard, documented behaviour - when the UDP response is large (here 'large' == 512 bytes) it falls back to TCP. The ID mismatch is not expected though. If I do dig @192.168.0.2 play.google.com then I still get the warning about using TCP mode, but it otherwise works $ dig @192.168.0.2 play.google.com ;; Truncated, retrying in TCP mode. ; <<>> DiG 9.8.1-P1 <<>> @192.168.0.2 play.google.com [snip most of the output] ;; Query time: 5 msec ;; SERVER: 192.168.0.2#53(192.168.0.2) ;; WHEN: Thu Oct 17 23:05:55 2013 ;; MSG SIZE rcvd: 521 At the moment I've not set up any zones in my local instance, so it's just acting as a caching resolver. My options config is pretty much unchanged from standard, I've got the following set: options { directory "/var/cache/bind"; allow-query { 192.168/16; 127.0.0.1; }; forwarders { 8.8.8.8; 8.8.4.4; }; dnssec-validation auto; edns-udp-size 4096 ; allow-transfer { any; }; auth-nxdomain no; # conform to RFC1035 listen-on-v6 { any; }; }; And my /etc/resolv.conf is just nameserver 127.0.0.1 search .local The problem definitely seems linked to the failover to TCP mode: if I do dig +bufsize=4096 @localhost play.google.com then it works; no warning about failover to TCP, no ID mismatch, and a standard looking result. To be honest, if there was a way to force bind to use a much larger UDP buffer, that'd probably be good enough for me, but all I've been able to find mention of is max-udp-size 4096 and that doesn't change the behaviour in any way. I've also tried setting edns-udp-size 512 in case the problem is some weird EDNS issue with my router (which seems unlikely since the +bufsize=4096 flag works fine). I've also tried dig +trace @localhost play.google.com; this works. No truncation/TCP warning, and a full result. I've also tried changing the servers used in the forwarder (e.g. to OpenDNS), but that makes no difference. There's one last data point: if I repetitively do dig @localhost play.google.com I don't always get an ID mismatch, but sometimes a REFUSED error. I'm much more likely to get a REFUSED error if I dig the non-localhost IP (192.168.0.2) first: $ dig @localhost play.google.com ;; Truncated, retrying in TCP mode. ; <<>> DiG 9.8.1-P1 <<>> @localhost play.google.com ; (1 server found) ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 35104 ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0 ;; QUESTION SECTION: ;play.google.com. IN A ;; Query time: 4 msec ;; SERVER: 127.0.0.1#53(127.0.0.1) ;; WHEN: Thu Oct 17 23:20:13 2013 ;; MSG SIZE rcvd: 33 Any insights or things to try would be much appreciated.

    Read the article

  • UnicodeEncodeError when uploading files in Django admin

    - by Samuel Linde
    Note: I asked this question on StackOverflow, but I realize this might be a more proper place to ask this kind of question. I'm trying to upload a file called 'Testaråäö.txt' via the Django admin app. I'm running Django 1.3.1 with Gunicorn 0.13.4 and Nginx 0.7.6.7 on a Debian 6 server. Database is PostgreSQL 8.4.9. Other Unicode data is saved to the database with no problem, so I guess the problem must be with the filesystem somehow. I've set http { charset utf-8; } in my nginx.conf. LC_ALL and LANG is set to 'sv_SE.UTF-8'. Running 'locale' verifies this. I even tried setting LC_ALL and LANG in my nginx init script just to make sure locale is set properly. Here's the traceback: Traceback (most recent call last): File "/srv/.virtualenvs/letebo/lib/python2.6/site-packages/django/core/handlers/base.py", line 111, in get_response response = callback(request, *callback_args, **callback_kwargs) File "/srv/.virtualenvs/letebo/lib/python2.6/site-packages/django/contrib/admin/options.py", line 307, in wrapper return self.admin_site.admin_view(view)(*args, **kwargs) File "/srv/.virtualenvs/letebo/lib/python2.6/site-packages/django/utils/decorators.py", line 93, in _wrapped_view response = view_func(request, *args, **kwargs) File "/srv/.virtualenvs/letebo/lib/python2.6/site-packages/django/views/decorators/cache.py", line 79, in _wrapped_view_func response = view_func(request, *args, **kwargs) File "/srv/.virtualenvs/letebo/lib/python2.6/site-packages/django/contrib/admin/sites.py", line 197, in inner return view(request, *args, **kwargs) File "/srv/django/letebo/app/cms/admin.py", line 81, in change_view return super(PageAdmin, self).change_view(request, obj_id) File "/srv/.virtualenvs/letebo/lib/python2.6/site-packages/django/utils/decorators.py", line 28, in _wrapper return bound_func(*args, **kwargs) File "/srv/.virtualenvs/letebo/lib/python2.6/site-packages/django/utils/decorators.py", line 93, in _wrapped_view response = view_func(request, *args, **kwargs) File "/srv/.virtualenvs/letebo/lib/python2.6/site-packages/django/utils/decorators.py", line 24, in bound_func return func(self, *args2, **kwargs2) File "/srv/.virtualenvs/letebo/lib/python2.6/site-packages/django/db/transaction.py", line 217, in inner res = func(*args, **kwargs) File "/srv/.virtualenvs/letebo/lib/python2.6/site-packages/django/contrib/admin/options.py", line 985, in change_view self.save_formset(request, form, formset, change=True) File "/srv/.virtualenvs/letebo/lib/python2.6/site-packages/django/contrib/admin/options.py", line 677, in save_formset formset.save() File "/srv/.virtualenvs/letebo/lib/python2.6/site-packages/django/forms/models.py", line 482, in save return self.save_existing_objects(commit) + self.save_new_objects(commit) File "/srv/.virtualenvs/letebo/lib/python2.6/site-packages/django/forms/models.py", line 613, in save_new_objects self.new_objects.append(self.save_new(form, commit=commit)) File "/srv/.virtualenvs/letebo/lib/python2.6/site-packages/django/forms/models.py", line 717, in save_new obj.save() File "/srv/.virtualenvs/letebo/lib/python2.6/site-packages/django/db/models/base.py", line 460, in save self.save_base(using=using, force_insert=force_insert, force_update=force_update) File "/srv/.virtualenvs/letebo/lib/python2.6/site-packages/django/db/models/base.py", line 504, in save_base self.save_base(cls=parent, origin=org, using=using) File "/srv/.virtualenvs/letebo/lib/python2.6/site-packages/django/db/models/base.py", line 543, in save_base for f in meta.local_fields if not isinstance(f, AutoField)] File "/srv/.virtualenvs/letebo/lib/python2.6/site-packages/django/db/models/fields/files.py", line 255, in pre_save file.save(file.name, file, save=False) File "/srv/.virtualenvs/letebo/lib/python2.6/site-packages/django/db/models/fields/files.py", line 92, in save self.name = self.storage.save(name, content) File "/srv/.virtualenvs/letebo/lib/python2.6/site-packages/django/core/files/storage.py", line 48, in save name = self.get_available_name(name) File "/srv/.virtualenvs/letebo/lib/python2.6/site-packages/django/core/files/storage.py", line 74, in get_available_name while self.exists(name): File "/srv/.virtualenvs/letebo/lib/python2.6/site-packages/django/core/files/storage.py", line 218, in exists return os.path.exists(self.path(name)) File "/srv/.virtualenvs/letebo/lib/python2.6/genericpath.py", line 18, in exists st = os.stat(path) UnicodeEncodeError: 'ascii' codec can't encode characters in position 52-54: ordinal not in range(128) I tried running Gunicorn with debugging turned on, and the file uploads without any problem at all. I suppose this must mean that the issue is with Nginx. Still beats me where to look, though. Here are the raw response headers from Gunicorn and Nginx, if it makes any sense: Gunicorn: HTTP/1.1 302 FOUND Server: gunicorn/0.13.4 Date: Thu, 09 Feb 2012 14:50:27 GMT Connection: close Transfer-Encoding: chunked Expires: Thu, 09 Feb 2012 14:50:27 GMT Vary: Cookie Last-Modified: Thu, 09 Feb 2012 14:50:27 GMT Location: http://my-server.se:8000/admin/cms/page/15/ Cache-Control: max-age=0 Content-Type: text/html; charset=utf-8 Set-Cookie: messages="yada yada yada"; Path=/ Nginx: HTTP/1.1 500 INTERNAL SERVER ERROR Server: nginx/0.7.67 Date: Thu, 09 Feb 2012 14:50:57 GMT Content-Type: text/html; charset=utf-8 Transfer-Encoding: chunked Connection: close Vary: Cookie 500 UPDATE: Both locale.getpreferredencoding() and sys.getfilesystemencoding() outputs 'UTF-8'. locale.getdefaultlocale() outputs ('sv_SE', 'UTF8'). This seem correct to me, so I'm still not sure why I keep getting these errors.

    Read the article

  • Spotlight Infinite Indexing issue (external data drive)

    - by Manca Weeks
    This is an external drive, formerly a boot drive which is now in use only to access music files (sibelius, audio, midi, live, logic etc.) without transferring the data into a new boot system, partly because of the issue I am about to describe, but mostly because the majority of the data is mainly there for archival purposes. The user is a composer and prominent musician and needs to be able to rehash the data at will. I have tried several things - here is a list: - make complete filesystem clone with antonio diaz's ddrescue - run Disk Warrior on copy, repair whatever errors occurred - wipe out all ACLs on entire drive - set all permissions to the same value - wide open 777 - remove any system data (applications, system files, including hidden files to the best of my knowledge) by selecting only non-system/app data and using Carbon Copy Cloner to put only the data of interest onto a newly formatted drive - transfer data to newly formatted drive folder by folder, resetting the spotlight index in between adding each to observe for issues (interesting here is that no issues occurred except for in Documents folder - when I transferred only the Documents folder to a newly formatted drive on its own - no trouble. It appears almost as thought it may not be the content but the quantity or specific combination of data that results in problems) - use DataRescue to transfer the data to yet another newly formatted drive to expose any missed hidden files Between each of the above steps I stopped Spotlight (search for anything beginning with md in Activity Monitor - All Processes and quitting it), deleted the .Spotlight-V100 directory from the affected drive. Restart Splotlight indexing by adding drive to Spotlight privacy list and removing it. In each case the same issue occurs - Spotlight begins indexing normally (or so it seems), then the index estimated time increases, usually to 4 hours remaining. This is where it gets stuck and continues to predict 4 hours remaining but never finishes. Sometimes I can't eject the drive and have to quit the md.. processes from Activity Monitor to be able to eject the drive without Force Eject. Once I disconnect the drive after the 4 hours remaining situation - if I reattach it, Spotlight forever estimates remaining time and never gets going again. So there it is. It is apparently not a filesystem issue, not a permissions issue and not tied to any particular piece of hardware or protocol (used USB and FW drives). I have tried this on several machines (3 to be precise) and in 10.5.8 and 10.6.5. Simply disabling Spotlight on this volume is not an option because the owner has no clue where things are as the data on the volume dates back to music projects and compositions from 2003 and before. He needs to be able to query for results. Anyone got any ideas? ---update 2-6-11 Since I have not received any responses except the one below which appears to misunderstand my point, I am updating this post hoping to get more responses. I have used the terminal command sudo opensnoop -p PID where PID is the mdworker process ID to try and determine what Spotlight is doing and hopefully find the files it's having trouble with. Here's what happens: After indexing for a few hours, mdworker is gone. It no longer shows up in Activity Monitor under "All Processes" and the Terminal window with the opensnoop result stops moving. I then proceeded to execute the same command on mds to see what it was doing and here's what I get, repeatedly: 501 57 mds 21 / 501 57 mds 21 /Volumes/Sno Leppard 501 57 mds 21 /Volumes/Tiger 501 57 mds 21 /Volumes/Leppard 501 57 mds 21 /Volumes/Disk Warrior 501 57 mds 21 /Volumes/ONM Data These represent all the volumes currently mounted in the system. All except ONM Data, which is the one I am trying to index, are excluded from SPotlight indexing at the moment. The sequence above repeats over and over, with slight variation, sometimes skipping one of the volumes. Questions - what happened to mdworker? What is mds doing? I will let this run until tomorrow morning and throughout the day and monitor for any changes. Any input would be very much appreciated. Even if you're not sure what the ultimate answer is, please alert me to anything you think I may be missing. Hopefully at some point we will figure this out... Thanks, M __final edit__ I finally resolved the issue and here is how I did it. I used the terminal command "sudo opensnoop -p PID" where the PID is the process id of the processes I was monitoring. I was looking at all instances of mds and mdworker running in the system. After the third time through indexing the same data set (see info above), I contacted Apple and got to their highest level of support - they were flabbergasted as well. They advised me to install yet another default 10.6.6 system and try again. The same pattern repeated - mds and mdworker(s) would start indexing and eventually the spotlight icon would say 6 hours remaining and all mdworkers were gone, mds at 90% or so of CPU. But I did finally figure out that the first time mdworker stopped like that, the last file it touched was always in the same folder. I excluded that folder from spotlight search and the rest of the data set indexed within about 2 hours with no strange behavior or failures. I copied that folder to another machine and Spotlight barfed immediately. Exclude that folder and all is well again. I have no clue what is causing this behavior, still, but I did find a functional solution to the problem. Anyone with a similar problem - run opensnoop on all instances of mds and mdworker and wait patiently for wdworker to exit. Look at the last file it touched and exclude the enclosing folder from being indexed. I was able to repeat the issue and solution on 2 different installs and 2 different copies of the data set. Hope this helps. If we find an actual cause of the folder being such a problem (it is called MICHAEL BRECKER RECORD SOLOS and contains almost 1 GB of audio related files - performer, live, SD2 - things like that), I will edit again to let you all know. Thanks for ay attempts to help, M

    Read the article

  • Windows 7 Machine Makes Router Drop -All- Wireless Connections [closed]

    - by Hammer Bro.
    Note: I accidentally originally posted this question over at SuperUser, and I still think the issue is caused by some low-level networking practice of Windows 7, but I think the expertise here would be more apt to figuring it out. Apologies for the cross-post. Some background: My home network consists of my Desktop, a two-month old Windows 7 (x64) machine which is online most frequently (N-spec), as well as three other Windows XP laptops (all G) that only connect every now and then (one for work, one for Netflix, and the other for infrequent regular laptop uses). I used to have a Belkin F5D8236-4 wireless router, and everything worked great. A week ago, however, I found out that the Belkin absolutely in no way would establish a VPN connection, something that has become important for work. So I bought a Netgear WNR3500v2/U/L. The wireless was acting a little sketchy at first for just the Windows 7 machine, but I thought it had something to do with 802.11N and I was in a hurry so I just fished up an ethernet cable and disabled the computer's wireless. It has now become apparent, though, that whenever the Windows 7 machine is connected to the router, all wireless connections become unstable. I was using my work laptop for a solid six hours today with no trouble, having multiple SSH connections open over VPN and streaming internet radio in the background. Then, within two minutes of turning on this Windows 7 box, I had lost all connectivity over the wireless. And I was two feet away from the router. The same sort of thing happens on all of the other laptops -- Netflix can be playing stuff all weekend, but if I come up here and do things on this (W7) computer, the streaming will be dead within ten minutes. So here are my basic observations: If the Windows 7 machine is off, then all connections will have a Signal Strength of Very Good or Excellent and a Speed of 48-54 Mbps for an indefinite amount of time. Shortly after the Windows 7 machine is turned on, all wireless connections will experience a consistent decline in Speed down to 1.0 Mbps, eventually losing their connection entirely. These machines will continue to maintain 70% signal strength, as observed by themselves and router. Once dropped, a wireless connection will have difficulty reconnecting. And, if a connection manages to become established, it will quickly drop off again. The Windows 7 machine itself will continue to function just fine if it's using a wired connection, although it will experience these same issues over the wireless. All of the drivers and firmwares are up to date, and this happened both with the stock Netgear firmware as well as the (current) DD-WRT. What I've tried: Making sure each computer is being assigned a distinct IP. (They are.) Disabling UPnP and Stateful Packet Inspection on the router. Disabling Network Sharing, SSDP Discovery, TCP/IP NetBios Helper and Computer Browser services on the Windows 7 machine. Disabling QoS Packet Scheduler, IPv6, and Link Layer Topology Discovery options on my ethernet controller (leaving only Client for Microsoft Networks, File and Printer Sharing, and IPv4 enabled). What I think: It seems awfully similar to the problems discussed in detail at http://social.msdn.microsoft.com/Forums/en/wsk/thread/1064e397-9d9b-4ae2-bc8e-c8798e591915 (which was both the most relevant and concrete information I could dig up on the internet). I still think that something the Windows 7 IP stack (or just Operating System itself) is doing is giving the router fits. However, I could be wrong, because I have two key differences. One is that most instances of this problem are reported as the entire router dying or restarting, and mine still works just fine over the wired connection. The other is that it's a new router, tested with both the factory firmware and the (I assume) well-maintained DD-WRT project. Even if Windows 7 is still secretly sending IPv6 packets or the TCP Window Scaling implementation that I hear Vista caused some trouble with (even though I've tried my best to disable anything fancy), this router should support those functions. I don't want to get a new or a replacement router unless someone can convince me that this is a defective unit. But the problem seems too specific and predictable by my instincts to be a hardware hiccup. And I don't want to deal with the inevitable problems that always seem to take half a day to resolve when getting a new router, since I'm frantically working (including tomorrow) to complete a project by next week's deadline. Plus, I think in the worst case scenario, I could keep this router connected directly to the modem, disable its wireless entirely, and connect the old Belkin to it directly. That should allow me to still use VPN (although I'll have to plug my work laptop directly into that router), and then maintain wireless connections for all of the other computers. But that feels so wrong to me. Anyone have any ideas what the cause and possible solution could be? Clarifications: The Windows 7 machine is directly connected via an ethernet cable to the router for everything above. But while it is online, all other computers' wireless connections become unusable. It is not an issue of signal strength or interference -- no other devices within scanning range are using Channel 1, and the problem will affect computers that are literally feet away from the router with 95% signal strength.

    Read the article

  • vSphere ESX 5.5 hosts cannot connect to NFS Server

    - by Gerald
    Summary: My problem is I cannot use the QNAP NFS Server as an NFS datastore from my ESX hosts despite the hosts being able to ping it. I'm utilising a vDS with LACP uplinks for all my network traffic (including NFS) and a subnet for each vmkernel adapter. Setup: I'm evaluating vSphere and I've got two vSphere ESX 5.5 hosts (node1 and node2) and each one has 4x NICs. I've teamed them all up using LACP/802.3ad with my switch and then created a distributed switch between the two hosts with each host's LAG as the uplink. All my networking is going through the distributed switch, ideally, I want to take advantage of DRS and the redundancy. I have a domain controller VM ("Central") and vCenter VM ("vCenter") running on node1 (using node1's local datastore) with both hosts attached to the vCenter instance. Both hosts are in a vCenter datacenter and a cluster with HA and DRS currently disabled. I have a QNAP TS-669 Pro (Version 4.0.3) (TS-x69 series is on VMware Storage HCL) which I want to use as the NFS server for my NFS datastore, it has 2x NICs teamed together using 802.3ad with my switch. vmkernel.log: The error from the host's vmkernel.log is not very useful: NFS: 157: Command: (mount) Server: (10.1.2.100) IP: (10.1.2.100) Path: (/VM) Label (datastoreNAS) Options: (None) cpu9:67402)StorageApdHandler: 698: APD Handle 509bc29f-13556457 Created with lock[StorageApd0x411121] cpu10:67402)StorageApdHandler: 745: Freeing APD Handle [509bc29f-13556457] cpu10:67402)StorageApdHandler: 808: APD Handle freed! cpu10:67402)NFS: 168: NFS mount 10.1.2.100:/VM failed: Unable to connect to NFS server. Network Setup: Here is my distributed switch setup (JPG). Here are my networks. 10.1.1.0/24 VM Management (VLAN 11) 10.1.2.0/24 Storage Network (NFS, VLAN 12) 10.1.3.0/24 VM vMotion (VLAN 13) 10.1.4.0/24 VM Fault Tolerance (VLAN 14) 10.2.0.0/24 VM's Network (VLAN 20) vSphere addresses 10.1.1.1 node1 Management 10.1.1.2 node2 Management 10.1.2.1 node1 vmkernel (For NFS) 10.1.2.2 node2 vmkernel (For NFS) etc. Other addresses 10.1.2.100 QNAP TS-669 (NFS Server) 10.2.0.1 Domain Controller (VM on node1) 10.2.0.2 vCenter (VM on node1) I'm using a Cisco SRW2024P Layer-2 switch (Jumboframes enabled) with the following setup: LACP LAG1 for node1 (Ports 1 through 4) setup as VLAN trunk for VLANs 11-14,20 LACP LAG2 for my router (Ports 5 through 8) setup as VLAN trunk for VLANs 11-14,20 LACP LAG3 for node2 (Ports 9 through 12) setup as VLAN trunk for VLANs 11-14,20 LACP LAG4 for the QNAP (Ports 23 and 24) setup to accept untagged traffic into VLAN 12 Each subnet is routable to another, although, connections to the NFS server from vmk1 shouldn't need it. All other traffic (vSphere Web Client, RDP etc.) goes through this setup fine. I tested the QNAP NFS server beforehand using ESX host VMs atop of a VMware Workstation setup with a dedicated physical NIC and it had no problems. The ACL on the NFS Server share is permissive and allows all subnet ranges full access to the share. I can ping the QNAP from node1 vmk1, the adapter that should be used to NFS: ~ # vmkping -I vmk1 10.1.2.100 PING 10.1.2.100 (10.1.2.100): 56 data bytes 64 bytes from 10.1.2.100: icmp_seq=0 ttl=64 time=0.371 ms 64 bytes from 10.1.2.100: icmp_seq=1 ttl=64 time=0.161 ms 64 bytes from 10.1.2.100: icmp_seq=2 ttl=64 time=0.241 ms Netcat does not throw an error: ~ # nc -z 10.1.2.100 2049 Connection to 10.1.2.100 2049 port [tcp/nfs] succeeded! The routing table of node1: ~ # esxcfg-route -l VMkernel Routes: Network Netmask Gateway Interface 10.1.1.0 255.255.255.0 Local Subnet vmk0 10.1.2.0 255.255.255.0 Local Subnet vmk1 10.1.3.0 255.255.255.0 Local Subnet vmk2 10.1.4.0 255.255.255.0 Local Subnet vmk3 default 0.0.0.0 10.1.1.254 vmk0 VM Kernel NIC info ~ # esxcfg-vmknic -l Interface Port Group/DVPort IP Family IP Address Netmask Broadcast MAC Address MTU TSO MSS Enabled Type vmk0 133 IPv4 10.1.1.1 255.255.255.0 10.1.1.255 00:50:56:66:8e:5f 1500 65535 true STATIC vmk0 133 IPv6 fe80::250:56ff:fe66:8e5f 64 00:50:56:66:8e:5f 1500 65535 true STATIC, PREFERRED vmk1 164 IPv4 10.1.2.1 255.255.255.0 10.1.2.255 00:50:56:68:f5:1f 1500 65535 true STATIC vmk1 164 IPv6 fe80::250:56ff:fe68:f51f 64 00:50:56:68:f5:1f 1500 65535 true STATIC, PREFERRED vmk2 196 IPv4 10.1.3.1 255.255.255.0 10.1.3.255 00:50:56:66:18:95 1500 65535 true STATIC vmk2 196 IPv6 fe80::250:56ff:fe66:1895 64 00:50:56:66:18:95 1500 65535 true STATIC, PREFERRED vmk3 228 IPv4 10.1.4.1 255.255.255.0 10.1.4.255 00:50:56:72:e6:ca 1500 65535 true STATIC vmk3 228 IPv6 fe80::250:56ff:fe72:e6ca 64 00:50:56:72:e6:ca 1500 65535 true STATIC, PREFERRED Things I've tried/checked: I'm not using DNS names to connect to the NFS server. Checked MTU. Set to 9000 for vmk1, dvSwitch and Cisco switch and QNAP. Moved QNAP onto VLAN 11 (VM Management, vmk0) and gave it an appropriate address, still had same issue. Changed back afterwards of course. Tried initiating the connection of NAS datastore from vSphere Client (Connected to vCenter or directly to host), vSphere Web Client and the host's ESX Shell. All resulted in the same problem. Tried a path name of "VM", "/VM" and "/share/VM" despite not even having a connection to server. I plugged in a linux system (10.1.2.123) into a switch port configured for VLAN 12 and tried mounting the NFS share 10.1.2.100:/VM, it worked successfully and I had read-write access to it I tried disabling the firewall on the ESX host esxcli network firewall set --enabled false I'm out of ideas on what to try next. The things I'm doing differently from my VMware Workstation setup is the use of LACP with a physical switch and a virtual distributed switch between the two hosts. I'm guessing the vDS is probably the source of my troubles but I don't know how to fix this problem without eliminating it.

    Read the article

  • Cross-platform distributed fault-tolerant (disconnected operation/local cache) filesystem

    - by Adrian Frühwirth
    We are facing a design "challenge" where we are required to set up a storage solution with the following properties: What we need HA a scalable storage backend offline/disconnected operation on the client to account for network outages cross-platform access client-side access from certainly Windows (probably XP upwards), possibly Linux backend integrates with AD/LDAP (permission management (user/group management, ...)) should work reasonably well over slow WAN-links Another problem is that we don't really know all possible use cases here, if people need to be able to have concurrent access to shared files or if they will only be accessing their own files, so a possible solution needs to account for concurrent access and how conflict management would look in this case from a user's point of view. This two years old blog posts sums up the impression that I have been getting during the last couple of days of research, that there are lots of current übercool projects implementing (non-Windows) clustered petabyte-capable blob-storage solutions but that there is none that supports disconnected operation nicely and natively, but I am hoping that we have missed an obvious solution. What we have tried OpenAFS We figured that we want a distributed network filesystem with a local cache and tested OpenAFS (which, as the only currently "stable" DFS supporting disconnected operation, seemed the way to go) for a week but there are several problems with it: it's a real pain to set up there are no official RHEL/CentOS packages the package of the current stable version 1.6.5.1 from elrepo randomly kernel panics on fresh installs, this is an absolute no-go Windows support (including the required Kerberos packages) is mystical. The current client for the 1.6 branch does not run on Windows 8, the current client for the 1.7 does but it just randomly crashes. After that experience we didn't even bother testing on XP and Windows 7. Suffice to say, we couldn't get it working and the whole setup has been so unstable and complicated to setup that it's just not an option for production. Samba + Unison Since OpenAFS was a complete disaster and no other DFS seems to support disconnected operation we went for a simpler idea that would sync files against a Samba server using Unison. This has the following advantages: Samba integrates with ADs; it's a pain but can be done. Samba solves the problem of remotely accessing the storage from Windows but introduces another SPOF and does not address the actual storage problem. We could probably stick any clustered FS underneath Samba, but that means we need a HA Samba setup on top of that to maintain HA which probably adds a lot of additional complexity. I vaguely remember trying to implement redundancy with Samba before and I could not silently failover between servers. Even when online, you are working with local files which will result in more conflicts than would be necessary if a local cache were only touched when disconnected It's not automatic. We cannot expect users to manually sync their files using the (functional, but not-so-pretty) GTK GUI on a regular basis. I attempted to semi-automate the process using the Windows task scheduler, but you cannot really do it in a satisfactory way. On top of that, the way Unison works makes syncing against Samba a costly operation, so I am afraid that it just doesn't scale very well or even at all. Samba + "Offline Files" After that we became a little desparate and gave Windows "offline files" a chance. We figured that having something that is inbuilt into the OS would reduce administrative efforts, helps blaming someone else when it's not working properly and should just work since people have been using this for years. Right? Wrong. We really wanted it to work, but it just doesn't. 30 minutes of copying files around and unplugging network cables/disabling network interfaces left us with (silent! there is only a tiny notification in Windows explorer in the statusbar, which doesn't even open Sync Center if you click on it!) undeletable files on the server (!) and conflicts that should not even be conflicts. In the end, we had one successful sync of a tiny text file, everything else just exploded horribly. Beyond that, there are other problems: Microsoft admits that "offline files" in Windows XP cannot cope with "large files" and therefore does not cache/sync them at all which would mean those files become unavailable if the connection drop In Windows 7 the feature is only available in the Professional/Ultimate/Enterprise editions. Summary Unless there is another fault-tolerant DFS that supports Windows natively I assume that stacking a HA Samba cluster on top of something like GlusterFS/Lustre/whatnot is the only option, but I hope that I am wrong here. How do other companies allow fault-tolerant network access to redundant storage in a heterogeneous environment with Windows?

    Read the article

  • Remote Desktop Services Gateway Issue

    - by AVandelay05
    Alright fellow techies here's the rundown. I have installed Server 2008 r2 Remote Dekstop Services on a VM in my network. I installed the following RD role services: RD Session Host, Licensing, Connection Broker, Gateway, Web Access. When I set things up originally, the gateway server and RDWeb worked as it should locally. After getting things running locally (remoteserver.domainname.local) I wanted to test things externally. From the outside, I couldn't get things running (meaning I could connect to rdweb access externally, but when I tried to run an app I would get the message "can't connect/find computer"). Here's my setup for external access The VM has every RD Services role services installed on it, meaning it acts as gateway, rd web access, session host, licensing, the whole bit. I made a self-signed certificate on the gateway server (gateway.domainname.net is the cert name). Internally, I have a secondary forward-lookup zone called domainname.net with an A record gateway pointing to the local IP of the gateway server. On our public DNS (domainname.net) I have an A record gateway. This is to access the RDWeb externally. In IIS I have the following authentication settings RDWeb: All disabled except for anonymous authentication Rpc: All disabled except for basic and windows RpcWithCert: All disbled except for windows authentication I have the necessary web access config in our sonicwall tz210 (https and rdp, external ip pointing to local ip of rds server) RAP and CAP have the correct user and computer groups, authentication, and allowed devices After all of this, here's what happens accessing externally. I can login correctly to RDWeb Access (I've tried a bogus login, I can't login to it so that's working properly). I see the Apps for use. I click on an app, click connect, the credential window opens, I put in the correct user creds, it tries to connect to the gateway server, but then the cred window comes back in view. I tried to reach a limit of failed logins, but never reached one, haha. So from the same external client machine I try to connect to the gateway through a Remote Desktop connection. I put in the correct gateway settings in the RD window, try to connect and get the same results as I did in RDWeb access. I checked the event logs on the RD Services machine and saw the following event IDs around the time I tried to login externally: ID 6037 with the message "The program svchost.exe, with the assigned process ID 2168, could not authenticate locally by using the target name host/gateway.domainname.net. The target name used is not valid. A target name should refer to one of the local computer names, for example, the DNS host name. Try a different target name." ID 10 RADWebAccess "RD Web Access was unable to access gateway.domainname.net, which is the server that is specified as running the RemoteApp and Desktop Connection Management service. Ensure that the computer account of the RD Web Access server is a member of the TS Web Access Computers security group on gateway.domainname.net" ID 4625 "An account failed to log on. Subject: Security ID: NULL SID Account Name: - Account Domain: - Logon ID: 0x0 Logon Type: 3 Account For Which Logon Failed: Security ID: NULL SID Account Name: Administrator Account Domain: gateway.domainname.net Failure Information: Failure Reason: Unknown user name or bad password. Status: 0xc000006d Sub Status: 0xc000006a Process Information: Caller Process ID: 0x0 Caller Process Name: - Network Information: Workstation Name: USER-LAPTOP Source Network Address: External IP Source Port: 63125 Detailed Authentication Information: Logon Process: NtLmSsp Authentication Package: NTLM Transited Services: - Package Name (NTLM only): - Key Length: 0 This event is generated when a logon request fails. It is generated on the computer where access was attempted. The Subject fields indicate the account on the local system which requested the logon. This is most commonly a service such as the Server service, or a local process such as Winlogon.exe or Services.exe. The Logon Type field indicates the kind of logon that was requested. The most common types are 2 (interactive) and 3 (network). The Process Information fields indicate which account and process on the system requested the logon. The Network Information fields indicate where a remote logon request originated. Workstation name is not always available and may be left blank in some cases. The authentication information fields provide detailed information about this specific logon request. - Transited services indicate which intermediate services have participated in this logon request. - Package name indicates which sub-protocol was used among the NTLM protocols." I don't think the VM has a null SID. The SID of the VM and it's physical host have different SIDS. I can access the blank page for rpc externally using the external gateway name. It seems like authentication is a problem. Also, is it a problem that the external name of the gateway server doesn't match the local name? The external name (which the cert is based on) is gateway.domainname.net and the internal name is remoteserver.domainname.local. That's the only thing I can think of that would be the problem, but the external name has to be different from the local right? Internally, I ping gateway.domainname.net and it gives me the correct local IP of the server. Now, there isn't an actual computer name in AD, but I don't know how I would achieve that? I hope I've been clear....any help would be appreciated. I think I'm close to achieving this. :)

    Read the article

  • VFS: file-max limit 1231582 reached

    - by Rick Koshi
    I'm running a Linux 2.6.36 kernel, and I'm seeing some random errors. Things like ls: error while loading shared libraries: libpthread.so.0: cannot open shared object file: Error 23 Yes, my system can't consistently run an 'ls' command. :( I note several errors in my dmesg output: # dmesg | tail [2808967.543203] EXT4-fs (sda3): re-mounted. Opts: (null) [2837776.220605] xv[14450] general protection ip:7f20c20c6ac6 sp:7fff3641b368 error:0 in libpng14.so.14.4.0[7f20c20a9000+29000] [4931344.685302] EXT4-fs (md16): re-mounted. Opts: (null) [4982666.631444] VFS: file-max limit 1231582 reached [4982666.764240] VFS: file-max limit 1231582 reached [4982767.360574] VFS: file-max limit 1231582 reached [4982901.904628] VFS: file-max limit 1231582 reached [4982964.930556] VFS: file-max limit 1231582 reached [4982966.352170] VFS: file-max limit 1231582 reached [4982966.649195] top[31095]: segfault at 14 ip 00007fd6ace42700 sp 00007fff20746530 error 6 in libproc-3.2.8.so[7fd6ace3b000+e000] Obviously, the file-max errors look suspicious, being clustered together and recent. # cat /proc/sys/fs/file-max 1231582 # cat /proc/sys/fs/file-nr 1231712 0 1231582 That also looks a bit odd to me, but the thing is, there's no way I have 1.2 million files open on this system. I'm the only one using it, and it's not visible to anyone outside the local network. # lsof | wc 16046 148253 1882901 # ps -ef | wc 574 6104 44260 I saw some documentation saying: file-max & file-nr: The kernel allocates file handles dynamically, but as yet it doesn't free them again. The value in file-max denotes the maximum number of file- handles that the Linux kernel will allocate. When you get lots of error messages about running out of file handles, you might want to increase this limit. Historically, the three values in file-nr denoted the number of allocated file handles, the number of allocated but unused file handles, and the maximum number of file handles. Linux 2.6 always reports 0 as the number of free file handles -- this is not an error, it just means that the number of allocated file handles exactly matches the number of used file handles. Attempts to allocate more file descriptors than file-max are reported with printk, look for "VFS: file-max limit reached". My first reading of this is that the kernel basically has a built-in file descriptor leak, but I find that very hard to believe. It would imply that any system in active use needs to be rebooted every so often to free up the file descriptors. As I said, I can't believe this would be true, since it's normal to me to have Linux systems stay up for months (even years) at a time. On the other hand, I also can't believe that my nearly-idle system is holding over a million files open. Does anyone have any ideas, either for fixes or further diagnosis? I could, of course, just reboot the system, but I don't want this to be a recurring problem every few weeks. As a stopgap measure, I've quit Firefox, which was accounting for almost 2000 lines of lsof output (!) even though I only had one window open, and now I can run 'ls' again, but I doubt that will fix the problem for long. (edit: Oops, spoke too soon. By the time I finished typing out this question, the symptom was/is back) Thanks in advance for any help. And another update: My system was basically unusable, so I decided I had no option but to reboot. But before I did, I carefully quit one process at a time, checking /proc/sys/fs/file-nr after each termination. I found that, predictably, the number of open files gradually went down as I closed things down. Unfortunately, it wasn't a large effect. Yes, I was able to clear up 5000-10000 open files, but there were still over 1.2 million left. I shut down just about everything. All interactive shells, except for the one ssh I left open to finish closing down, httpd, even nfs service. Basically everything in the process table that wasn't a kernel process, and there were still an appalling number of files apparently left open. After the reboot, I found that /proc/sys/fs/file-nr showed about 2000 files open, which is much more reasonable. Starting up 2 Xvnc sessions as usual, along with the dozen or so monitoring windows I like to keep open, brought the total up to about 4000 files. I can see nothing wrong with that, of course, but I've obviously failed to identify the root cause. I'm still looking for ideas, since I definitely expect it to happen again. And another update, the next day: I watched the system carefully, and discovered that /proc/sys/fs/file-nr showed a growth of about 900 open files per hour. I shut down the system's only NFS client for the night, and the growth stopped. Mind you, it didn't free up the resources, but it did at least stop consuming more. Is this a known bug with NFS? I'll be bringing the NFS client back online today, and I'll narrow it down further. If anyone is familiar with this behavior, feel free to jump in with "Yeah, NFS4 has this problem, go back to NFS3" or something like that.

    Read the article

  • CISCO 2911 Router configuration

    - by bala
    Device cisco 2911 router configuration support is required please. I have exchange server 2010 configured and working without any errors the problem is in cisco router configuration when exchange server sends emails out the receives WAN IP not the public ip. I have configured RDNS lookups with our MX record IP addesses that match the FQDN but all our emails are rejected because it does not match with the public ip. Receiving mails problem is not an problem all mails are coming through. i am sure i am missing something on the router configuration that does not sends the public ip, can any one help me to solve this issue. Note; I've got 1 WAN IP & 8 Public IP from ISP . Find below the running configuration. Building configuration... Current configuration : 2734 bytes ! ! Last configuration change at 06:32:13 UTC Tue Apr 3 2012 ! NVRAM config last updated at 06:32:14 UTC Tue Apr 3 2012 ! NVRAM config last updated at 06:32:14 UTC Tue Apr 3 2012 version 15.1 service timestamps debug datetime msec service timestamps log datetime msec service password-encryption ! hostname BSBG-LL ! boot-start-marker boot-end-marker ! ! enable secret 5 $x$xHrxxxxx5ox0 enable password 7 xx23xx5FxxE1xx044 ! no aaa new-model ! no ipv6 cef ip source-route ip cef ! ! ! ! ! ip flow-cache timeout active 1 ip domain name yourdomain.com ip name-server 213.42.20.20 ip name-server 195.229.241.222 multilink bundle-name authenticated ! ! crypto pki token default removal timeout 0 ! ! license udi pid CISCO2911/K9 ! ! username bsbg ! ! ! ! ! ! interface Embedded-Service-Engine0/0 no ip address shutdown ! interface GigabitEthernet0/0 ip address 192.168.0.9 255.255.255.0 ip flow ingress ip nat inside ip virtual-reassembly in duplex auto speed 100 no cdp enable ! interface GigabitEthernet0/1 ip address 213.42.xx.x2 255.255.255.252 ip nat outside ip virtual-reassembly in duplex auto speed auto no cdp enable ! interface GigabitEthernet0/2 no ip address shutdown duplex auto speed auto ! ip forward-protocol nd ! no ip http server no ip http secure-server ! ip nat inside source list 120 interface GigabitEthernet0/1 overload ip nat inside source static tcp 192.168.0.4 25 94.56.89.100 25 extendable ip nat inside source static tcp 192.168.0.4 53 94.56.89.100 53 extendable ip nat inside source static udp 192.168.0.4 53 94.56.89.100 53 extendable ip nat inside source static tcp 192.168.0.4 110 94.56.89.100 110 extendable ip nat inside source static tcp 192.168.0.4 443 94.56.89.100 443 extendable ip nat inside source static tcp 192.168.0.4 587 94.56.89.100 587 extendable ip nat inside source static tcp 192.168.0.4 995 94.56.89.100 995 extendable ip nat inside source static tcp 192.168.0.4 3389 94.56.89.100 3389 extendable ip nat inside source static tcp 192.168.0.4 443 94.56.89.101 443 extendable ip nat inside source static tcp 192.168.0.12 80 94.56.89.102 80 extendable ip nat inside source static tcp 192.168.0.12 443 94.56.89.102 443 extendable ip nat inside source static tcp 192.168.0.12 3389 94.56.89.102 3389 extendable ip route 0.0.0.0 0.0.0.0 213.42.69.41 ! access-list 120 permit ip 192.168.0.0 0.0.0.255 any ! ! ! control-plane ! ! ! line con 0 exec-timeout 5 0 line aux 0 line 2 no activation-character no exec transport preferred none transport input all transport output pad telnet rlogin lapb-ta mop udptn v120 ssh stopbits 1 line vty 0 4 password 7 xx64xxD530D26086Dxx login transport input all ! scheduler allocate 20000 1000 end

    Read the article

  • Deployment Workbench no longer available after PXE boot

    - by Patrick
    Our build process revolves around windows Deployment Workbench. Unfortunately this was setup by someone who is no longer with the company, and no-one has ever dared/needed to make any changes. The other day it stopped working. It turns out that one of our build guys started thinking about changing some stuff in it, clicked something and now it no longer works (He is saying now that he right clicked on the 'LAB' entry in 'Deployment points' and hit 'Update', which took some time to run through apparently). The job has fallen on me to resolve and frankly I'm not sure what I'm doing. I was wondering if someone with more experience than me can provide some pointers as to troubleshooting cos I'm feeling quite a lot in the dark here. On the server I have Deployment Workbench up and running (MMC snapin) version 3.0. There is a WDS service that appears to be running ok, as does the tFTPd service. Nothing specific to this in event logs. From the client side; PXE boot works and gets you to the Win PE launch, and it has the correct company logo as the background (proving to me that its loading win PE from the network). WPEINIT runs, and asks for domain credentials, here the team simply put User/Pass/Domain in the boxes and click ok. Normally the build would kick off. Instead they get an error message saying that the \NATBLU01\Distribution$ share isn't available. Checking \NATBLU01\Distribution$ shows that its there and accessible over the network. Security/permissions seem ok, even 'ANONYMOUS LOGON' has read access to that share so I don't see that being a problem. Digging the trace files from C:\MININT\SMSOSD\OSDLOGS\ after an attempt to run the build I can see an error saying much the same - <![LOG[Validating connection to \\NATBLU01\Distribution$]LOG]!><time="16:42:14.000+000" date="03-15-2012" component="LiteTouch" context="" type="1" thread="" file="LiteTouch"> <![LOG[FindFile: The file OSDConnectToUNC.exe could not be found in any standard locations.]LOG]!><time="16:42:14.000+000" date="03-15-2012" component="LiteTouch" context="" type="1" thread="" file="LiteTouch"> <![LOG[The network location cannot be reached. For information about network troubleshooting, see Windows Help.]LOG]!><time="16:42:24.000+000" date="03-15-2012" component="LiteTouch" context="" type="3" thread="" file="LiteTouch"> <![LOG[ERROR - Unable to map a network drive to \\NATBLU01\Distribution$.]LOG]!><time="16:42:24.000+000" date="03-15-2012" component="LiteTouch" context="" type="3" thread="" file="LiteTouch"> BDD.LOG shows much the same. Full copies of the .LOG files can both files be found here : BDD.LOG LITETOUCH.LOG I can get to a command prompt from the Win PE that boots from PXE, however there isn't any network stuff there. IPCONFIG returns nothing so none of the tests I would usually run resolve anything. I'm at a loss frankly. I did wonder if I could perhaps start a new build process but if the change to the DeploymentWorkbench has knocked it offline I don't think I'm going to be able to create a new deployment. Failing that; we do have a deployment point labeled type 'Media' which appears to be a DVD ISO image of one of the builds, but its dated 2008, is it possible to export the network build to .ISO and build from DVD? We are looking at new hardware to run this from anyway (for the impending Windows 7 roll out) so a temporary work round isn't going to be too much of a problem. All assistance is appreciated! EDIT : OK. Got it working again. Solution was close to Newmanth's idea. The problem was that our PE image didn't appear to be connecting the network. I had an older copy of the PE boot.WIM on a stick that I had been using for other purposes. I booted that and correctly got a network connection. Showed a correct internal IP and could ping out etc etc. However I was still getting the same errors in all the logs and in when wpeinit was running. What I did seperately was to update the PE image that DeploymentWorkbench was pushing out to display a different back ground. I wanted to prove that I was working in the correct place. Turns out that I wasn't. I went and looked at the other deployment stuff we had on this machine, Windows Deployment Services was installed and although all the install images are off line the boot image was online, so I uploaded the copy from my stick to that. Booted straight off. And fixed. Working. Yay! For anyone stumbling across this in the future you may find that although your deployment images are located in the DeploymentWorkbench, the Win PE boot image you are launching from is located in the associated Windows Deployment Services images.

    Read the article

< Previous Page | 756 757 758 759 760 761 762 763 764 765 766 767  | Next Page >