Search Results

Search found 294 results on 12 pages for 'algorithmic trading'.

Page 9/12 | < Previous Page | 5 6 7 8 9 10 11 12 | Next Page >

How to avoid becoming a programmer while still beign closely involved with computer science/Industry

- by WeShallOvercome

I am studying computer science (A masters at an Ivy league), however most of the jobs i find involve way too much of programming. And frankly programming is not an issue, however programming without a meaning (read financial institution (non trading), other non mainstream jobs) bore me to death! I dont want to end up becoming a .NET,C#, Java kind of programmer. Can someone tell where i should look for jobs if i wish to do some real computer science work such as Machine Learning etc. I don't mind programming but becoming a Financial Software dev at Bloomeberg or an SDET at Microsoft isn't actually one of my goals. [note: I have interviewed for intern both positions listed above, and thankfully i got an intern for a data mining position in a top 750 Alexa rank web company] Sorry if angered anyone with a subjective question

Read the article
Using a message class static method taking in an action to wrap Try/Catch

- by Chris Marisic

I have a Result object that lets me pass around a List of event messages and I can check whether an action was successful or not. I've realized I've written this code in alot of places Result result; try { //Do Something ... //New result is automatically a success for not having any errors in it result = new Result(); } catch (Exception exception) { //Extension method that returns a Result from the exception result = exception.ToResult(); } if(result.Success) .... What I'm considering is replacing this usage with public static Result CatchException(Action action) { try { action(); return new Result(); } catch (Exception exception) { return exception.ToResult(); } } And then use it like var result = Result.CatchException(() => _model.Save(something)); Does anyone feel there's anything wrong with this or that I'm trading reusability for obscurity?

Read the article
Slides of my HOL on MySQL Cluster

- by user13819847

Hi!Thanks everyone who attended my hands-on lab on MySQL Cluster at MySQL Connect last Saturday.The following are the links for the slides, the HOL instructions, and the code examples.I'll try to summarize my HOL below.Aim of the HOL was to help attendees to familiarize with MySQL Cluster. In particular, by learning: the basics of MySQL Cluster Architecture the basics of MySQL Cluster Configuration and Administration how to start a new Cluster for evaluation purposes and how to connect to it We started by introducing MySQL Cluster. MySQL Cluster is a proven technology that today is successfully servicing the most performance-intensive workloads. MySQL Cluster is deployed across telecom networks and is powering mission-critical web applications. Without trading off use of commodity hardware, transactional consistency and use of complex queries, MySQL Cluster provides: Web Scalability (web-scale performance on both reads and writes) Carrier Grade Availability (99.999%) Developer Agility (freedom to use SQL or NoSQL access methods) MySQL Cluster implements: an Auto-Sharding, Multi-Master, Shared-nothing Architecture, where independent nodes can scale horizontally on commodity hardware with no shared disks, no shared memory, no single point of failure In the architecture of MySQL Cluster it is possible to find three types of nodes: management nodes: responsible for reading the configuration files, maintaining logs, and providing an interface to the administration of the entire cluster data nodes: where data and indexes are stored api nodes: provide the external connectivity (e.g. the NDB engine of the MySQL Server, APIs, Connectors) MySQL Cluster is recommended in the situations where: it is crucial to reduce service downtime, because this produces a heavy impact on business sharding the database to scale write performance higly impacts development of application (in MySQL Cluster the sharding is automatic and transparent to the application) there are real time needs there are unpredictable scalability demands it is important to have data-access flexibility (SQL & NoSQL) MySQL Cluster is available in two Editions: Community Edition (Open Source, freely downloadable from mysql.com) Carrier Grade Edition (Commercial Edition, can be downloaded from eDelivery for evaluation purposes) MySQL Carrier Grade Edition adds on the top of the Community Edition: Commercial Extensions (MySQL Cluster Manager, MySQL Enterprise Monitor, MySQL Cluster Installer) Oracle's Premium Support Services (largest team of MySQL experts backed by MySQL developers, forward compatible hot fixes, multi-language support, and more) We concluded talking about the MySQL Cluster vision: MySQL Cluster is the default database for anyone deploying rapidly evolving, realtime transactional services at web-scale, where downtime is simply not an option. From a practical point of view the HOL's steps were: MySQL Cluster installation start & monitoring of the MySQL Cluster processes client connection to the Management Server and to an SQL Node connection using the NoSQL NDB API and the Connector J In the hope that this blog post can help you get started with MySQL Cluster, I take the opportunity to thank you for the questions you made both during the HOL and at the MySQL Cluster booth. Slides are also on SlideShares: Santo Leto - MySQL Connect 2012 - Getting Started with Mysql Cluster Happy Clustering!

Read the article
Victor Grazi, Java Champion!

- by Tori Wieldt

Congratulations to Victor Grazi, who has been made a Java Champion! He was nominated by his peers and selected as a Java Champion for his experience as a developer, and his work in the Java and Open Source communities. Grazi is a Java evangelist and serves on the Executive Committee of the Java Community Process, representing Credit Suisse - the first non-technology vendor on the JCP. He also arranges the NY Java SIG meetings at Credit Suisse's New York campus each month, and he says it has been a valuable networking opportunity. He also is the spec lead for JSR 354, the Java Money and Currency API. Grazi has been building real time financial systems in Java since JDK version 1.02! In 1996, the internet was just starting to happen, Grazi started a dot com called Supermarkets to Go, that provided an on-line shopping presence to supermarkets and grocers. Grazi wrote most of the code, which was a great opportunity for him to learn Java and UI development, as well as database management. Next, he went to work at Bank of NY building a trading system. He studied for Java certification, and he noted that getting his certification was a game changer because it helped him started to learn the nuances of the Java language. He has held other development positions, "You may have noticed that you don't get as much junk mail from Citibank as you used to - that is thanks to one of my projects!" he told us. Grazi joined Credit Suisse in 2005 and is currently Vice President on the central architecture team. Grazi is proud of his open source project, Java Concurrent Animated, a series of animations that visualize the functionality of the components in the java.util.concurrent library. "It has afforded me the opportunity to speak around the globe" and because of it, has discovered that he really enjoys doing public presentations. He is a fine addition to the Java Champions program. The Java Champions are an exclusive group of passionate Java technology and community leaders who are community-nominated and selected under a project sponsored by Oracle. Nominees are named and selected through a peer review process. Java Champions get the opportunity to provide feedback, ideas, and direction that will help Oracle grow the Java Platform. This interchange may be in the form of technical discussions and/or community-building activities with Oracle's Java Development and Developer Program teams.

Read the article
BPEL 11.1.1.6 Certified for Prebuilt E-Business Suite 12.1.3 SOA Integrations

- by Steven Chan (Oracle Development)

Service Oriented Architecture (SOA) integrations with Oracle E-Business Suite can either be custom integrations that you build yourself or prebuilt integrations from Oracle. For more information about the differences between the two options for SOA integrations, see this previously-published certification announcement. There are five prebuilt BPEL business processes by Oracle E-Business Suite Release 12 product teams: Oracle Price Protection (DPP) Complex Maintenance, Repair & Overhaul (CMRO/AHL) Oracle Transportation Management (WMS, WSH, PO) Advanced Supply Chain Planning (MSC) Product Information Management (PIM/EGO) Last year we announced the certification of BPEL 11.1.1.5 for Prebuilt E-Business Suite 12.1.3 SOA integrations. The five prebuilt BPEL processes have now been certified with Oracle BPEL Process Manager 11g version 11.1.1.6 (in Oracle Fusion Middleware SOA Suite 11g). These prebuilt BPEL processes are certified with Oracle E-Business Suite Release 12.1.3 and higher. Note: The Supply Chain Trading Connector (CLN) product team has opted not to support BPEL 11g with their prebuilt business processes previously certified with BPEL 10.1.3.5. If you have a requirement for that certification, I would recommend contacting your Oracle account manager to ensure that the Supply Chain team is notified appropriately. For additional information about prebuilt integrations with Oracle E-Business Suite Release 12.1.3, please refer to the following documentation: Integrating Oracle E-Business Suite Release 12 with Oracle BPEL available in Oracle SOA Suite 11g (Note 1321776.1) Oracle Fusion Middleware 11g (11.1.1.6.0) Documentation Library Installing Oracle SOA Suite and Oracle Business Process Management Suite Release Notes for Oracle Fusion Middleware 11g (11.1.1.6) Certified Platforms Linux x86 (Oracle Linux 4, 5) Linux x86 (RHEL 5) Linux x86 (SLES 10) Linux x86-64 (Oracle Linux 4, 5, 6) Linux x86-64 (RHEL 5) Linux x86-64 (SLES 10) Oracle Solaris on SPARC (64-bit) (9, 10, 11) HP-UX Itanium (11.23, 11.31) HP-UX PA-RISC (64-bit) (11.23, 11.31) IBM AIX on Power Systems (64-bit) (5.3, 6.1, 7) IBM: Linux on System z (RHEL 5, SLES 10) Microsoft Windows Server (32-bit) (2003, 2008) Microsoft Windows x64 (64-bit) (2008 R2) Getting SupportIf you need support for the prebuilt EBS 12.1.3 BPEL business processes, you can log Service Requests against the Applications Technology Group product family. Related Articles BPEL 11.1.1.5 Certified for Prebuilt E-Business Suite 12.1.3 SOA Integrations Webcast Replay Available: SOA Integration Options for E-Business Suite Securing E-Business Suite Web Services with Integrated SOA Gateway

Read the article
Stop trying to be perfect

- by Kyle Burns

Yes, Bob is my uncle too. I also think the points in the Manifesto for Software Craftsmanship (manifesto.softwarecraftsmanship.org) are all great. What amazes me is that tend to confuse the term “well crafted” with “perfect”. I'm about to say something that will make Quality Assurance managers and many development types as well until you think about it as a craftsman – “Stop trying to be perfect”. Now let me explain what I mean. Building software, as with building almost anything, often involves a series of trade-offs where either one undesired characteristic is accepted as necessary to achieve another desired one (or maybe stave off one that is even less desirable) or a desirable characteristic is sacrificed for the same reasons. This implies that perfection itself is unattainable. What is attainable is “sufficient” and I think that this really goes to the heart both of what people are trying to do with Agile and with the craftsmanship movement. Simply put, sufficient software drives the greatest business value. I've been in many meetings where “how can we keep anything from ever going wrong” has become the thing that holds us in analysis paralysis. I've also been the guy trying way too hard to perfect some function to make sure that every edge case is accounted for. Somewhere in there, something a drill instructor said while I was in boot camp occurred to me. In response to being asked a question by another recruit having to do with some edge case (I can barely remember the context), he said “What if grasshoppers had machine guns? Would the birds still **** with them?” It sounds funny, but there's a lot of wisdom in those words. “Sufficient” is different for every situation and it’s important to understand what sufficient means in the context of the work you’re doing. If I’m writing a timesheet application (and please shoot me if I am), I’m going to have a much higher tolerance for imperfection than if you’re writing software to control life support systems on spacecraft. I’m also likely to have less need for high volume performance than if you’re writing software to control stock trading transactions. I’d encourage anyone who has read this far to instead of trying to be perfect, try to create software that is sufficient in every way. If you’re working to make a component that is sufficient “better”, ask yourself if there is any component left that is not yet sufficient. If the answer is “yes” you’re working on the wrong thing and need to adjust. If the answer is “no”, why aren’t you shipping and delivering business value?

Read the article
Numerically stable(ish) method of getting Y-intercept of mouse position?

- by Fraser

I'm trying to unproject the mouse position to get the position on the X-Z plane of a ray cast from the mouse. The camera is fully controllable by the user. Right now, the algorithm I'm using is... Unproject the mouse into the camera to get the ray: Vector3 p1 = Vector3.Unproject(new Vector3(x, y, 0), 0, 0, width, height, nearPlane, farPlane, viewProj; Vector3 p2 = Vector3.Unproject(new Vector3(x, y, 1), 0, 0, width, height, nearPlane, farPlane, viewProj); Vector3 dir = p2 - p1; dir.Normalize(); Ray ray = Ray(p1, dir); Then get the Y-intercept by using algebra: float t = -ray.Position.Y / ray.Direction.Y; Vector3 p = ray.Position + t * ray.Direction; The problem is that the projected position is "jumpy". As I make small adjustments to the mouse position, the projected point moves in strange ways. For example, if I move the mouse one pixel up, it will sometimes move the projected position down, but when I move it a second pixel, the project position will jump back to the mouse's location. The projected location is always close to where it should be, but it does not smoothly follow a moving mouse. The problem intensifies as I zoom the camera out. I believe the problem is caused by numeric instability. I can make minor improvements to this by doing some computations at double precision, and possibly abusing the fact that floating point calculations are done at 80-bit precision on x86, however before I start micro-optimizing this and getting deep into how the CLR handles floating point, I was wondering if there's an algorithmic change I can do to improve this? EDIT: A little snooping around in .NET Reflector on SlimDX.dll: public static Vector3 Unproject(Vector3 vector, float x, float y, float width, float height, float minZ, float maxZ, Matrix worldViewProjection) { Vector3 coordinate = new Vector3(); Matrix result = new Matrix(); Matrix.Invert(ref worldViewProjection, out result); coordinate.X = (float) ((((vector.X - x) / ((double) width)) * 2.0) - 1.0); coordinate.Y = (float) -((((vector.Y - y) / ((double) height)) * 2.0) - 1.0); coordinate.Z = (vector.Z - minZ) / (maxZ - minZ); TransformCoordinate(ref coordinate, ref result, out coordinate); return coordinate; } // ... public static void TransformCoordinate(ref Vector3 coordinate, ref Matrix transformation, out Vector3 result) { Vector3 vector; Vector4 vector2 = new Vector4 { X = (((coordinate.Y * transformation.M21) + (coordinate.X * transformation.M11)) + (coordinate.Z * transformation.M31)) + transformation.M41, Y = (((coordinate.Y * transformation.M22) + (coordinate.X * transformation.M12)) + (coordinate.Z * transformation.M32)) + transformation.M42, Z = (((coordinate.Y * transformation.M23) + (coordinate.X * transformation.M13)) + (coordinate.Z * transformation.M33)) + transformation.M43 }; float num = (float) (1.0 / ((((transformation.M24 * coordinate.Y) + (transformation.M14 * coordinate.X)) + (coordinate.Z * transformation.M34)) + transformation.M44)); vector2.W = num; vector.X = vector2.X * num; vector.Y = vector2.Y * num; vector.Z = vector2.Z * num; result = vector; } ...which seems to be a pretty standard method of unprojecting a point from a projection matrix, however this serves to introduce another point of possible instability. Still, I'd like to stick with the SlimDX Unproject routine rather than writing my own unless it's really necessary.

Read the article
Best Practices for High Volume CPA Import Operations with ebXML in B2B 11g

- by Shub Lahiri, A-Team

Background B2B 11g supports ebXML messaging protocol, where multiple CPAs can be imported via command-line utilities. This note highlights one aspect of the best practices for import of CPA, when large numbers of CPAs in the excess of several hundreds are required to be maintained within the B2B repository. Symptoms The import of CPA usually is a 2-step process, namely creating a soa.zip file using b2bcpaimport utility based on a CPA properties file and then using b2bimport to import the b2b repository. The commands are provided below: ant -f ant-b2b-util.xml b2bcpaimport -Dpropfile="<Path to cpp_cpa.properties>" -Dstandard=true ant -f ant-b2b-util.xml b2bimport -Dlocalfile=true -Dexportfile="<Path to soa.zip>" -Doverwrite=true Usually the first command completes fairly quickly regardless of the number of CPAs in the repository. However, as the number of trading partners within the repository goes up, the time to complete the second command could go up to ~30 secs per operation. So, this could add up to a significant amount, if there is a need to import hundreds of CPA in a production system within a limited downtime, maintenance window. Remedy In situations, where there is a large number of entries to be imported, it is best to setup a staging environment and go through the import operation of each individual CPA in an empty repository. Since, this will be done in an empty repository, the time taken for completion should be reasonable. After all the partner profiles have been imported, a full repository export can be taken to capture the metadata for all the entries in one file. If this single file with all the partner entries is imported in a loaded repository, the total time taken for import of all the CPAs should see a dramatic reduction. Results Let us take a look at the numbers to see the benefit of this approach. With a pre-loaded repository of ~400 partners, the individual import time for each entry takes ~30 secs. So, if we had to import another 100 partners, the individual entries will take ~50 minutes (100 times ~30 secs). On the other hand, if we prepare the repository export file of the same 100 partners from a staging environment earlier, the import takes about ~5 mins. The total processing time for the loading of metadata, specially in a production environment, can thus be shortened by almost a factor of 10. Summary The following diagram summarizes the entire approach and process. Acknowledgements The material posted here has been compiled with the help from B2B Engineering and Product Management teams.

Read the article
linebreak in url with Bibtex and hyperref package

- by Tim

Why is this item not shown properly in my bibliography? @misc{ann, abstract = {ANN is an implbmentation of nearest neighbor search.}, author = {David M. Mount and Sunil Arya}, howpublished = {\url{http://www.cs.umd.edu/~mount/ANN/}}, keywords = {knn}, posted-at = {2010-04-08 00:05:04}, priority = {2}, title = {ANN.}, url = "http://www.cs.umd.edu/~mount/ANN/", year = {2008} } @misc{Nilsson96introductionto, author = {Nilsson, Nils J.}, citeulike-article-id = {6995464}, howpublished = {\url{http://robotics.stanford.edu/people/nilsson/mlbook.html}}, keywords = {*file-import-10-04-11}, posted-at = {2010-04-11 06:52:28}, priority = {2}, title = {Introduction to Machine Learning: An Early Draft of a Proposed Textbook.}, year = {1996} } EDIT: I am using \usepackage{hyperref}, not \usepackage{url}. I don't know what changes I just made made the first item appear properly now @misc{ann, abstract = {ANN is an implbmentation of nearest neighbor search.}, author = {David M. Mount and Sunil Arya}, howpublished = {\url{http://www.cs.umd.edu/~mount/ANN/}}, keywords = {ann}, posted-at = {2010-04-08 00:05:04}, priority = {2}, title = {The \textsc{A}pproximate \textsc{N}earest \textsc{N}eighbor \textsc{S}earching \textsc{L}ibrary.}, url = "http://www.cs.umd.edu/~mount/ANN/", year = {2008} } EDIT: Since I am using hyperref package, it produces error when using url package together with it. So the two cannot work together? I would like to use hyper links inside pdf file, so I would like to use hyperref package instead of url package. I googled a bit, and try \usepackage[hyperindex,breaklinks]{hyperref}, but there is still no line break just as before. How can I do it? Is there conflict in the packages that I am now using?: \usepackage{amsmath} \usepackage{amsfonts} \usepackage[dvips]{graphicx} \usepackage{wrapfig} \graphicspath{{./figs/}} \DeclareGraphicsExtensions{.eps} \usepackage{fixltx2e} \usepackage{array} \usepackage{times} \usepackage{fancyhdr} \usepackage{multirow} \usepackage{algorithmic} \usepackage{algorithm} \usepackage{slashbox} \usepackage{multirow} \usepackage{rotating} \usepackage{longtable} \usepackage[hyperindex,breaklinks]{hyperref} \usepackage{forloop} \usepackage{lscape} \usepackage{supertabular} \usepackage{amssymb} \usepackage{amsthm}

Read the article
Python coding test problem for interviews

- by Kal

I'm trying to come up with a good coding problem to ask interview candidates to solve with Python. They'll have an hour to work on the problem, with an IDE and access to documentation (we don't care what people have memorized). I'm not looking for a tough algorithmic problem - there are other sections of the interview where we do that kind of thing. The point of this section is to sit and watch them actually write code. So it should be something that makes them use just the data structures which are the everyday tools of the application developer - lists, hashtables (dictionaries in Python), etc, to solve a quasi-realistic task. They shouldn't be blocked completely if they can't think of something really clever. We have a problem which we use for Java coding tests, which involves reading a file and doing a little processing on the contents. It works well with candidates who are familiar with Java (or even C++). But we're running into a number of candidates who just don't know Java or C++ or C# or anything like that, but do know Python or Ruby. Which shouldn't exclude them, but leaves us with a dilemma: On the one hand, we don't learn much from watching someone struggle with the basics of a totally unfamiliar language. On the other hand, the problem we use for Java turns out to be pretty trivial in Python (or Ruby, etc) - anyone halfway competent can do it in 15 minutes. So, I'm trying to come up with something better. Surprisingly, Google doesn't show me anyone doing something like this, unless I'm just too dumb to enter the obvious search term. The best idea I've come up with involves scheduling workers to time slots, but it's maybe a little too open-ended. Have you run into a good example? Or a bad one? Or do you just have an idea?

Read the article
Concurrent cartesian product algorithm in Clojure

- by jqno

Is there a good algorithm to calculate the cartesian product of three seqs concurrently in Clojure? I'm working on a small hobby project in Clojure, mainly as a means to learn the language, and its concurrency features. In my project, I need to calculate the cartesian product of three seqs (and do something with the results). I found the cartesian-product function in clojure.contrib.combinatorics, which works pretty well. However, the calculation of the cartesian product turns out to be the bottleneck of the program. Therefore, I'd like to perform the calculation concurrently. Now, for the map function, there's a convenient pmap alternative that magically makes the thing concurrent. Which is cool :). Unfortunately, such a thing doesn't exist for cartesian-product. I've looked at the source code, but I can't find an easy way to make it concurrent myself. Also, I've tried to implement an algorithm myself using map, but I guess my algorithmic skills aren't what they used to be. I managed to come up with something ugly for two seqs, but three was definitely a bridge too far. So, does anyone know of an algorithm that's already concurrent, or one that I can parallelize myself?

Read the article
Worse is better. Is there an example?

- by J.F. Sebastian

Is there a widely-used algorithm that has time complexity worse than that of another known algorithm but it is a better choice in all practical situations (worse complexity but better otherwise)? An acceptable answer might be in a form: There are algorithms A and B that have O(N**2) and O(N) time complexity correspondingly, but B has such a big constant that it has no advantages over A for inputs less then a number of atoms in the Universe. Examples highlights from the answers: Simplex algorithm -- worst-case is exponential time -- vs. known polynomial-time algorithms for convex optimization problems. A naive median of medians algorithm -- worst-case O(N**2) vs. known O(N) algorithm. Backtracking regex engines -- worst-case exponential vs. O(N) Thompson NFA -based engines. All these examples exploit worst-case vs. average scenarios. Are there examples that do not rely on the difference between the worst case vs. average case scenario? Related: The Rise of ``Worse is Better''. (For the purpose of this question the "Worse is Better" phrase is used in a narrower (namely -- algorithmic time-complexity) sense than in the article) Python's Design Philosophy: The ABC group strived for perfection. For example, they used tree-based data structure algorithms that were proven to be optimal for asymptotically large collections (but were not so great for small collections). This example would be the answer if there were no computers capable of storing these large collections (in other words large is not large enough in this case). Coppersmith–Winograd algorithm for square matrix multiplication is a good example (it is the fastest (2008) but it is inferior to worse algorithms). Any others? From the wikipedia article: "It is not used in practice because it only provides an advantage for matrices so large that they cannot be processed by modern hardware (Robinson 2005)."

Read the article
FIlling a Java Bean tree structure from a csv flat file

- by Clem

Hi, I'm currently trying to construct a list of bean classes in Java from a flat description file formatted in csv. Concretely : Here is the structure of the csv file : MES_ID;GRP_PARENT_ID;GRP_ID;ATTR_ID M1 ; ;G1 ;A1 M1 ; ;G1 ;A2 M1 ;G1 ;G2 ;A3 M1 ;G1 ;G2 ;A4 M1 ;G2 ;G3 ;A5 M1 ; ;G4 ;A6 M1 ; ;G4 ;A7 M1 ; ;G4 ;A8 M2 ; ;G1 ;A1 M2 ; ;G1 ;A2 M2 ; ;G2 ;A3 M2 ; ;G2 ;A4 It corresponds to the hierarchical data structure : M1 ---G1 ------A1 ------A2 ------G2 ---------A3 ---------A4 ---------G3 ------------A5 ------G4 ---------A7 ---------A8 M2 ---G1 ------A1 ------A2 ---G2 ------A3 ------A4 Remarks : A message M can have an infinite number of groups G and attributes A A group G can have an infinite number of attributes and an infinite number of under-groups each of them having under-groups too That beeing said, I'm trying to read this flat csv decription to store it in this structure of beans : Map<String, MBean> messages = new HashMap<String, Mbean>(); == public class MBean { private String mes_id; private Map<String, GBean> groups; } public class GBean { private String grp_id; private Map<String, ABean> attributes; private Map<String, GBean> underGroups; } public class ABean { private String attr_id; } Reading the csv file sequentially is ok and I've been investigating how to use recursion to store the description data, but couldn't find a way. Thanks in advance for any of your algorithmic ideas. I hope it will put you in the mood of thinking about this ... I've to admit that I'm out of ideas :s

Read the article
Algorithm to match list of regular expressions

- by DSII

I have two algorithmic questions for a project I am working on. I have thought about these, and have some suspicions, but I would love to hear the community's input as well. Suppose I have a string, and a list of N regular expressions (actually they are wildcard patterns representing a subset of full regex functionality). I want to know whether the string matches at least one of the regular expressions in the list. Is there a data structure that can allow me to match the string against the list of regular expressions in sublinear (presumably logarithmic) time? This is an extension of the previous problem. Suppose I have the same situation: a string and a list of N regular expressions, only now each of the regular expressions is paired with an offset within the string at which the match must begin (or, if you prefer, each of the regular expressions must match a substring of the given string beginning at the given offset). To give an example, suppose I had the string: This is a test string and the regex patterns and offsets: (a) his.* at offset 0 (b) his.* at offset 1 The algorithm should return true. Although regex (a) does not match the string beginning at offset 0, regex (b) does match the substring beginning at offset 1 ("his is a test string"). Is there a data structure that can allow me to solve this problem in sublinear time? One possibly useful piece of information is that often, many of the offsets in the list of regular expressions are the same (i.e. often we are matching the substring at offset X many times). This may be useful to leverage the solution to problem #1 above. Thank you very much in advance for any suggestions you may have!

Read the article
Autodetect Presence of CSV Headers in a File

- by banzaimonkey

Short question: How do I automatically detect whether a CSV file has headers in the first row? Details: I've written a small CSV parsing engine that places the data into an object that I can access as (approximately) an in-memory database. The original code was written to parse third-party CSV with a predictable format, but I'd like to be able to use this code more generally. I'm trying to figure out a reliable way to automatically detect the presence of CSV headers, so the script can decide whether to use the first row of the CSV file as keys / column names or start parsing data immediately. Since all I need is a boolean test, I could easily specify an argument after inspecting the CSV file myself, but I'd rather not have to (go go automation). I imagine I'd have to parse the first 3 to ? rows of the CSV file and look for a pattern of some sort to compare against the headers. I'm having nightmares of three particularly bad cases in which: The headers include numeric data for some reason The first few rows (or large portions of the CSV) are null There headers and data look too similar to tell them apart If I can get a "best guess" and have the parser fail with an error or spit out a warning if it can't decide, that's OK. If this is something that's going to be tremendously expensive in terms of time or computation (and take more time than it's supposed to save me) I'll happily scrap the idea and go back to working on "important things". I'm working with PHP, but this strikes me as more of an algorithmic / computational question than something that's implementation-specific. If there's a simple algorithm I can use, great. If you can point me to some relevant theory / discussion, that'd be great, too. If there's a giant library that does natural language processing or 300 different kinds of parsing, I'm not interested.

Read the article
Better ways to implement a modulo operation (algorithm question)

- by ryxxui

I've been trying to implement a modular exponentiator recently. I'm writing the code in VHDL, but I'm looking for advice of a more algorithmic nature. The main component of the modular exponentiator is a modular multiplier which I also have to implement myself. I haven't had any problems with the multiplication algorithm- it's just adding and shifting and I've done a good job of figuring out what all of my variables mean so that I can multiply in a pretty reasonable amount of time. The problem that I'm having is with implementing the modulus operation in the multiplier. I know that performing repeated subtractions will work, but it will also be slow. I found out that I could shift the modulus to effectively subtract large multiples of the modulus but I think there might still be better ways to do this. The algorithm that I'm using works something like this (weird pseudocode follows): result,modulus : integer (n bits) (previously defined) shiftcount : integer (initialized to zero) while( (modulus<result) and (modulus(n-1) != 1) ){ modulus = modulus << 1 shiftcount++ } for(i=shiftcount;i>=0;i++){ if(modulus<result){result = result-modulus} if(i!=0){modulus = modulus << 1} } So...is this a good algorithm, or at least a good place to start? Wikipedia doesn't really discuss algorithms for implementing the modulo operation, and whenever I try to search elsewhere I find really interesting but incredibly complicated (and often unrelated) research papers and publications. If there's an obvious way to implement this that I'm not seeing, I'd really appreciate some feedback.

Read the article
Interactive Web Collaboration Platform

- by user1477586

I am currently unsatisfied with most, if not all, web-based collaboration tools I've ever used. Examples of what I'm considering a "web-based collaboration tool" are TWiki, WebEx, etc. What I have in mind of developing is to develop my own. The target "market" is for software/hardware developers who will likely be working on separate projects linked by a common goal. Example: You work at a company who is designing a brand new printer. So, one man is working on the ink cartridge, another on the paper feed, one on the circuitry of the printer and one more is working on the drivers. What I would like to make (unless it already exists) is a website that, upon loading, presents the user with a web that has each project within it's own bubble, graphically. These bubbles would all be linked to the central bubble of "[Company Name]". Upon selecting an individual bubble the browser would focus and zoom in on that project bubble and expand more information (in bullets or with more affiliated bubbles) about subprojects, progress, setbacks, etc. Then, at a terminal "node" (i.e. a bubble with no sub-bubbles) a selection would then load a page with information relevant to that node. That's a lot of monologue. In short, I want to know how I might approach this. My background is with algorithmic use of Python and C++. I've never done interactive web design like this; though, I have gone through the W3C tutorials on HTML4/5. I'm willing to learn a new language I'm just wondering which language/set of languages seem(s) most appropriate for this project. Thanks, world, for imparting your knowledge to me. An example of what the start page might look like is:

Read the article
Performing text processing on flatpage content to include handling of custom tag

- by Dzejkob

Hi. I'm using flatpages app in my project to manage some html content. That content will include images, so I've made a ContentImage model allowing user to upload images using admin panel. The user should then be able to include those images in content of the flatpages. He can of course do that by manually typing image url into <img> tag, but that's not what I'm looking for. To make including images more convenient, I'm thinking about something like this: User edits an additional, let's say pre_content field of CustomFlatPage model (I'm using custom flatpage model already) instead of defining <img> tags directly, he uses a custom tag, something like [img=...] where ... is name of the ContentImage instance now the hardest part: before CustomFlatPage is saved, pre_content field is checked for all [img=...] occurences and they are processed like this: ContentImage model is searched if there's image instance with given name and if so, [img=...] is replaced with proper <img> tag. flatpage actual content is filled with processed pre_content and then flatpage is saved (pre_content is leaved unchanged, as edited by user) The part that I can't cope with is text processing. Should I use regular expressions? Apparently they can be slow for large strings. And how to organize logic? I assume it's rather algorithmic question, but I'm not familliar with text processing in Python enough, to do it myself. Can somebody give me any clues?

Read the article
Quickest algorithm for finding sets with high intersection

- by conradlee

I have a large number of user IDs (integers), potentially millions. These users all belong to various groups (sets of integers), such that there are on the order of 10 million groups. To simplify my example and get to the essence of it, let's assume that all groups contain 20 user IDs (i.e., all integer sets have a cardinality of 100). I want to find all pairs of integer sets that have an intersection of 15 or greater. Should I compare every pair of sets? (If I keep a data structure that maps userIDs to set membership, this would not be necessary.) What is the quickest way to do this? That is, what should my underlying data structure be for representing the integer sets? Sorted sets, unsorted---can hashing somehow help? And what algorithm should I use to compute set intersection)? I prefer answers that relate C/C++ (especially STL), but also any more general, algorithmic insights are welcome. Update Also, note that I will be running this in parallel in a shared memory environment, so ideas that cleanly extend to a parallel solution are preferred.

Read the article
Verizon SongID - How is it programmed?

- by CheeseConQueso

For anyone not familiar with Verizon's SongID program, it is a free application downloadable through Verizon's VCast network. It listens to a song for 10 seconds at any point during the song and then sends this data to some all-knowing algorithmic beast that chews it up and sends you back all the ID3 tags (artist, album, song, etc...) The first two parts and last part are straightforward, but what goes on during the processing after the recorded sound is sent? I figure it must take the sound file (what format?), parse it (how? with what?) for some key identifiers (what are these? regular attributes of wave functions? phase/shift/amplitude/etc), and check it against a database. Everything I find online about how this works is something generic like what I typed above. From audiotag.info This service is based on a sophisticated audio recognition algorithm combining advanced audio fingerprinting technology and a large songs' database. When you upload an audio file, it is being analyzed by an audio engine. During the analysis its audio “fingerprint” is extracted and identified by comparing it to the music database. At the completion of this recognition process, information about songs with their matching probabilities are displayed on screen.

Read the article
Architecture for database analytics

- by David Cournapeau

Hi, We have an architecture where we provide each customer Business Intelligence-like services for their website (internet merchant). Now, I need to analyze those data internally (for algorithmic improvement, performance tracking, etc...) and those are potentially quite heavy: we have up to millions of rows / customer / day, and I may want to know how many queries we had in the last month, weekly compared, etc... that is the order of billions entries if not more. The way it is currently done is quite standard: daily scripts which scan the databases, and generate big CSV files. I don't like this solutions for several reasons: as typical with those kinds of scripts, they fall into the write-once and never-touched-again category tracking things in "real-time" is necessary (we have separate toolset to query the last few hours ATM). this is slow and non-"agile" Although I have some experience in dealing with huge datasets for scientific usage, I am a complete beginner as far as traditional RDBM go. It seems that using column-oriented database for analytics could be a solution (the analytics don't need most of the data we have in the app database), but I would like to know what other options are available for this kind of issues.

Read the article
How can I create photo effects in Android?

- by PaulH

I'd like to make an Android app that lets a user apply cool effects to photos taken with the camera. There are already a few out there, I know, but I'd like to try my own hand at one. I'm trying to figure out the best way to implement these effects. Here are some examples from the excellent Vignette app (which I own): http://www.flickr.com/groups/vignetteforandroid/pool/ I have been googling and stack-overflowing, but so far I've mostly found some references to published papers or books. I am ordering this one from Amazon presently - Digital Image Processing: An Algorithmic Introduction using Java After some reading, I think I have a basic understanding of manipulating the RGB values for all the pixels in the image. My main question is how do I come up with a transformation that produces cool effects? By cool effects I mean some like those in the Vignette app or IPhone apps: ToyCamera Polarize I already have quite a bit of experience with Java, and I've made my first app for android already. Any ideas? Thanks in advance.

Read the article
Three most critical programming concepts

- by Todd

I know this has probably been asked in one form or fashion but I wanted to pose it once again within the context of my situation (and probably others here @ SO). I made a career change to Software Engineering some time ago without having an undergrad or grad degree in CS. I've supplemented my undergrad and grad studies in business with programming courses (VB, Java,C, C#) but never performed academic coursework in the other related disciplines (algorithms, design patterns, discrete math, etc.)...just mostly self-study. I know there are several of you who have either performed interviews and/or made hiring decisions. Given recent trends in demand, what would you say are the three most essential Comp Sci concepts that a developer should have a solid grasp of outside of language syntax? For example, I've seen blog posts of the "Absolute minimum X that every programmer must know" variety...that's what I'm looking for. Again if it's truly a redundancy please feel free to close; my feelings won't be hurt. (Closest ones I could find were http://stackoverflow.com/questions/164048/basic-programming-algorithmic-concepts- which was geared towards a true beginner, and http://stackoverflow.com/questions/648595/essential-areas-of-knowledge-which I didn't feel was concrete enough). Thanks in advance all! T.

Read the article
Why junit ComparisonFailure is not used by assertEquals(Object, Object) ?

- by Philippe Blayo

In Junit 4, do you see any drawback to throw a ComparisonFailure instead of an AssertionError when assertEquals(Object, Object) fails ? assertEquals(Object, Object) throws a ComparisonFailure if both expected and actual are String an AssertionError if either is not a String @Test(expected=ComparisonFailure.class ) public void twoString() { assertEquals("a String", "another String"); } @Test(expected=AssertionError.class ) public void oneString() { assertEquals("a String", new Object()); } The two reasons why I ask the question: ComparisonFailure provide far more readable way to spot the differences in dialog box of eclipse or Intellij IDEA (FEST-Assert throws this exception) Junit 4 already use String.valueOf(Object) to build message "expected ... but was ..." (format method invoqued by Assert.assertEquals(message, Object, Object) in junit-4.8.2): static String format(String message, Object expected, Object actual) { ... String expectedString= String.valueOf(expected); String actualString= String.valueOf(actual); if (expectedString.equals(actualString)) return formatted + "expected: " + formatClassAndValue(expected, expectedString) +" but was: " + formatClassAndValue(actual, actualString); else return formatted +"expected:<"+ expectedString +"> but was:<"+ actualString +">"; Isn't it possible in assertEquals(message, Object, Object) to replace fail(format(message, expected, actual)); by throw new ComparisonFailure(message, formatClassAndValue(expectedObject, expectedString), formatClassAndValue(actualObject, actualString)); Do you see any compatibility issue with other tool, any algorithmic problem with that... ?

Read the article
What happens when you create an instance of an object containing no state in C#?

- by liquorice

I am I think ok at algorithmic programming, if that is the right term? I used to play with turbo pascal and 8086 assembly language back in the 1980s as a hobby. But only very small projects and I haven't really done any programming in the 20ish years since then. So I am struggling for understanding like a drowning swimmer. So maybe this is a very niave question or I'm just making no sense at all, but say I have an object kind of like this: class Something : IDoer { void Do(ISomethingElse x) { x.DoWhatEverYouWant(42); } } And then I do var Thing1 = new Something(); var Thing2 = new Something(); Thing1.Do(blah); Thing2.Do(blah); does Thing1 = Thing2? does "new Something()" create anything? Or is it not much different different from having a static class, except I can pass it around and swap it out etc. Is the "Do" procedure in the same location in memory for both the Thing1(blah) and Thing2(blah) objects? I mean when executing it, does it mean there are two Something.Do procedures or just one?

Read the article

Search Results

Search found 294 results on 12 pages for 'algorithmic trading'.

Page 9/12 | < Previous Page | 5 6 7 8 9 10 11 12 | Next Page >

- by WeShallOvercome

- by Chris Marisic

- by user13819847

- by Tori Wieldt

- by Steven Chan (Oracle Development)

- by Kyle Burns

- by Fraser

- by Shub Lahiri, A-Team

- by Tim

- by Kal

- by jqno

- by J.F. Sebastian

- by Clem

- by DSII

- by banzaimonkey

- by ryxxui

- by user1477586

- by Dzejkob

- by conradlee

- by CheeseConQueso

- by David Cournapeau

- by PaulH

- by Todd

- by Philippe Blayo

- by liquorice

< Previous Page | 5 6 7 8 9 10 11 12 | Next Page >