Search Results

Search found 5 results on 1 pages for 'elliot42'.

Page 1/1 | 1

Duck checker in Python: does one exist?

- by elliot42

Python uses duck-typing, rather than static type checking. But many of the same concerns ultimately apply: does an object have the desired methods and attributes? Do those attributes have valid, in-range values? Whether you're writing constraints in code, or writing test cases, or validating user input, or just debugging, inevitably somewhere you'll need to verify that an object is still in a proper state--that it still "looks like a duck" and "quacks like a duck." In statically typed languages you can simply declare "int x", and anytime you create or mutate x, it will always be a valid int. It seems feasible to decorate a Python object to ensure that it is valid under certain constraints, and that every time that object is mutated it is still valid under those constraints. Ideally there would be a simple declarative syntax to express "hasattr length and length is non-negative" (not in those words. Not unlike Rails validators, but less human-language and more programming-language). You could think of this as ad-hoc interface/type system, or you could think of it as an ever-present object-level unit test. Does such a library exist to declare and validate constraint/duck-checking on Python-objects? Is this an unreasonable tool to want? :) (Thanks!) Contrived example: rectangle = {'length': 5, 'width': 10} # We live in a fictional universe where multiplication is super expensive. # Therefore any time we multiply, we need to cache the results. def area(rect): if 'area' in rect: return rect['area'] rect['area'] = rect['length'] * rect['width'] return rect['area'] print area(rectangle) rectangle['length'] = 15 print area(rectangle) # compare expected vs. actual output! # imagine the same thing with object attributes rather than dictionary keys.

Read the article
Can Clojure's thread-based agents handle c10k performance?

- by elliot42

I'm writing a c10k-style service and am trying to evaluate Clojure's performance. Can Clojure agents handle this scale of concurrency with its thread-based agents? Other high performance systems seem to be moving towards async-IO/events/greenlets, albeit at a seemingly higher complexity cost. Suppose there are 10,000 clients connected, sending messages that should be appended to 1,000 local files--the Clojure service is trying to write to as many files in parallel as it can, while not letting any two separate requests mangle the same single file by writing at the same time. Clojure agents are extremely elegant conceptually--they would allow separate files to be written independently and asynchronously, while serializing (in the database sense) multiple requests to write to the same file. My understanding is that agents work by starting a thread for each operation (assume we are IO-bound and using send-off)--so in this case is it correct that it would start 1,000+ threads? Can current-day systems handle this number of threads efficiently? Most of them should be IO-bound and sleeping most of the time, but I presume there would still be a context-switching penalty that is theoretically higher than async-IO/event-based systems (e.g. Erlang, Go, node.js). If the Clojure solution can handle the performance, it seems like the most elegant thing to code. However if it can't handle the performance then something like Erlang or Go's lightweight processes might be preferable, since they are designed to have tens of thousands of them spawned at once, and are only moderately more complex to implement. Has anyone approached this problem in Clojure or compared to these other platforms? (Thanks for your thoughts!)

Read the article
Visually and audibly unambiguous subset of the Latin alphabet?

- by elliot42

Imagine you give someone a card with the code "5SBDO0" on it. In some fonts, the letter "S" is difficult to visually distinguish from the number five, (as with number zero and letter "O"). Reading the code out loud, it might be difficult to distinguish "B" from "D", necessitating saying "B as in boy," "D as in dog," or using a "phonetic alphabet" instead. What's the biggest subset of letters and numbers that will, in most cases, both look unambiguous visually and sound unambiguous when read aloud? Background: We want to generate a short string that can encode as many values as possible while still being easy to communicate. Imagine you have a 6-character string, "123456". In base 10 this can encode 10^6 values. In hex "1B23DF" you can encode 16^6 values in the same number of characters, but this can sound ambiguous when read aloud. ("B" vs. "D") Likewise for any string of N characters, you get (size of alphabet)^N values. The string is limited to a length of about six characters, due to wanting to fit easily within the capacity of human working memory capacity. Thus to find the max number of values we can encode, we need to find that largest unambiguous set of letters/numbers. There's no reason we can't consider the letters G-Z, and some common punctuation, but I don't want to have to go manually pairwise compare "does G sound like A?", "does G sound like B?", "does G sound like C" myself. As we know this would be O(n^2) linguistic work to do =)...

Read the article
Data architecture for event log metrics?

- by elliot42

My service has a large ongoing number of user events, and we would like to do things like "count occurrence of event type T since date D." We are trying to make two basic decisions: What to store? Storing every event vs. only storing aggregates (Event log style) log every event and count them later, vs. (Time-series style) store a single aggregated "count of event E for date D" for every day Where to store the data In a relational database (particularly MySQL) In a non-relational (NoSQL) database In flat log files (collected centrally over the network via syslog-ng) What is standard practice / where can I read more about comparing the different types of systems? Additional details: The total event stream is large, potentially hundreds of thousands of entries per day But our current need is only to count certain types of events within it We don't necessarily need real-time access to the raw data or aggregation results IMHO, "log all events to files, crawl them at a later time to filter and aggregate the stream" is a pretty standard UNIX Way, but my Rails-y compatriots seem to think that nothing is real unless it's in MySQL.

Read the article
What's the right way to start a node.js service?

- by elliot42

I'm running a node.js service (statsd) on CentOS 6. What's the proper way to daemonize and start such a service? Potential Daemonizers--are daemonizers supposed to be language-specific or general?: forever (node-specific) daemonize nohup (presumably wrong) start-stop-daemon(debian-only? is this for daemonizing or starting/stopping? what is the Centos equivalent?) Should the app itself really know how to daemonize itself and then have a -d flag? (e.g. via node-daemonize2 or forever-monitor?) Service starters--should these be from the system/distro, or should they be from monitoring tools such as monit?: service? is really /etc/init.d on CentOS? service? is really Upstart on Ubuntu? monit? daemontools? runit? I'm unfortunately new to this--where can I read up on what is the most standard, classic, reliable way of doing this?

Read the article

1