Search Results

Search found 19 results on 1 pages for 'dfa'.

Page 1/1 | 1 

  • Problem with creating a deterministic finite automata (DFA) - Mercury

    - by Jabba The hut
    I would like to have a deterministic finite automata (DFA) simulated in Mercury. But I’m s(t)uck at several places. Formally, a DFA is described with the following characteristics: a setOfStates S, an inputAlphabet E <-- summation symbol, a transitionFunction : S × E -- S, a startState s € S, a setOfAcceptableFinalStates F =C S. A DFA will always starts in the start state. Then the DFA will read all the characters on the input, one by one. Based on the current input character and the current state, there will be made to a new state. These transitions are defined in the transitions function. when the DFA is in one of his acceptable final states, after reading the last character, then will the DFA accept the input, If not, then the input will be is rejected. The figure shows a DFA the accepting strings where the amount of zeros, is a plurality of three. Condition 1 is the initial state, and also the only acceptable state. for each input character is the corresponding arc followed to the next state. Link to Figure What must be done A type “mystate” which represents a state. Each state has a number which is used for identification. A type “transition” that represents a possible transition between states. Each transition has a source_state, an input_character, and a final_state. A type “statemachine” that represents the entire DFA. In the solution, the DFA must have the following properties: The set of all states, the input alphabet, a transition function, represented as a set of possible transitions, a set of accepting final states, a current state of the DFA A predicate “init_machine (state machine :: out)” which unifies his arguments with the DFA, as shown as in the Figure. The current state for the DFA is set to his initial state, namely, 1. The input alphabet of the DFA is composed of the characters '0'and '1'. A user can enter a text, which will be controlled by the DFA. the program will continues until the user types Ctrl-D and simulates an EOF. If the user use characters that are not allowed into the input alphabet of the DFA, then there will be an error message end the program will close. (pred require) Example Enter a sentence: 0110 String is not ok! Enter a sentence: 011101 String is not ok! Enter a sentence: 110100 String is ok! Enter a sentence: 000110010 String is ok! Enter a sentence: 011102 Uncaught exception Mercury: Software Error: Character does not belong to the input alphabet! the thing wat I have. :- module dfa. :- interface. :- import_module io. :- pred main(io.state::di, io.state::uo) is det. :- implementation. :- import_module int,string,list,bool. 1 :- type mystate ---> state(int). 2 :- type transition ---> trans(source_state::mystate, input_character::bool, final_state::mystate). 3 (error, finale_state and current_state and input_character) :- type statemachine ---> dfa(list(mystate),list(input_character),list(transition),list(final_state),current_state(mystate)) 4 missing a lot :- pred init_machine(statemachine :: out) is det. %init_machine(statemachine(L_Mystate,0,L_transition,L_final_state,1)) :- <-probably fault 5 not perfect main(!IO) :- io.write_string("\nEnter a sentence: ", !IO), io.read_line_as_string(Input, !IO), ( Invoer = ok(StringVar), S1 = string.strip(StringVar), (if S1 = "mustbeabool" then io.write_string("Sentenceis Ok! ", !IO) else io.write_string("Sentence is not Ok!.", !IO)), main(!IO) ; Invoer = eof ; Invoer = error(ErrorCode), io.format("%s\n", [s(io.error_message(ErrorCode))], !IO) ). Hope you can help me kind regards

    Read the article

  • questions on nfa and dfa..

    - by Loop
    Hi Guys... Hope you help me with this one.... I have a main question which is ''how to judge whether a regular expression will be accepted by NFA and/or DFA? For eg. My question says that which of the regular expressions are equivalent? explain... 1.(a+b)*b(a+b)*b(a+b)* 2.a*ba*ba* 3.a*ba*b(a+b)* do we have to draw the NFA and DFA and then find through minimisation algorithm? if we do then how do we come to know that which regular expression is accepted by NFA/DFA so that we can begin with the answer? its so confusing.... Second is a very similar one, the question asks me to show that the language (a^nb^n|n1} is not accepted by DFA...grrrrr...how do i know this? (BTW this is a set of all strings of where a number of a's is followed by the same number of b's).... I hope I explained clearly well....

    Read the article

  • How to read reduce/shift conflicts in LR(1) DFA?

    - by greenoldman
    I am reading an explanation (awesome "Parsing Techniques" by D.Grune and C.J.H.Jacobs; p.293 in the 2nd edition) and I moved forward from my last question: How to get lookahead symbol when constructing LR(1) NFA for parser? Now I have such "problem" (maybe not a problem, but rather need of confirmation from some more knowledgeable people). The authors present state in LR(0) which has reduce/shift conflict. Then they build DFA for LR(1) for the same grammar. And now they say it does not have a conflict (lookaheads at the end): S -> E . eof E -> E . - T eof E -> E . - T - and there is an edge from this state labeled - but no labeled eof. Authors says, that on eof there will be reduce, on - there will be shift. However eof is for shift as well (as lookahead). So my personal understanding of LR(1) DFA is this -- you can drop lookaheads for shifts, because they serve no purpose now -- shifts rely on input, not on lookaheads -- and after that, remove duplicates. S -> E . eof E -> E . - T So the lookahead for reduce serves as input really, because at this stage (all required input is read) it is really incoming symbol right now. For shifts, the input symbols are on the edges. So my question is this -- am I actually right about dropping lookaheads for shifts (after fully constructing DFA)?

    Read the article

  • How lookaheads are propagated in "channel" method of building LALR parser?

    - by greenoldman
    The method is described in Dragon Book, however I read about it in ""Parsing Techniques" by D.Grune and C.J.H.Jacobs". I start from my understanding of building channels for NFA: channels are built once, they are like water channels with current you "drop" lookahead symbols in right places (sources) of the channel, and they propagate with "current" when symbol propagates, there are no barriers (the only sufficient things for propagation are presence of channel and direction/current); i.e. lookahead cannot just die out of the blue Is that right? If I am correct, then eof lookahead should be present in all states, because the source of it is the start production, and all other production states are reachable from start state. How the DFA is made out of this NFA is not perfectly clear for me -- the authors of the mentioned book write about preserving channels, but I see no purpose, if you propagated lookaheads. If the channels have to be preserved, are they cut off from the source if the DFA state does not include source NFA state? I assume no -- the channels still runs between DFA states, not only within given DFA state. In the effect eof should still be present in all items in all states. But when you take a look at DFA presented in book (pdf is from errata): DFA for LALR (fig. 9.34 in the book, p.301) you will see there are items without eof in lookahead. The grammar for this DFA is: S -> E E -> E - T E -> T T -> ( E ) T -> n So how it was computed, when eof was dropped, and on what condition? Update It is textual pdf, so two interesting states (in DFA; # is eof): State 1: S--- >•E[#] E--- >•E-T[#-] E--- >•T[#-] T--- >•n[#-] T--- >•(E)[#-] State 6: T--- >(•E)[#-)] E--- >•E-T[-)] E--- >•T[-)] T--- >•n[-)] T--- >•(E)[-)] Arc from 1 to 6 is labeled (.

    Read the article

  • A compiler for automata theory

    - by saadtaame
    I'm designing a programming language for automata theory. My goal is to allow programmers to use machines (DFA, NFA, etc...) as units in expressions. I'm confused whether the language should be compiled, interpreted, or jit-compiled! My intuition is that compilation is a good choice, for some operations might take too much time (converting NFA's to equivalent DFA's can be expensive). Translating to x86 seems good. There is one issue however: I want the user to be able to plot machines. Any ideas?

    Read the article

  • Efficient (basic) regular expression implementation for streaming data

    - by Brendan Dolan-Gavitt
    I'm looking for an implementation of regular expression matching that operates on a stream of data -- i.e., it has an API that allows a user to pass in one character at a time and report when a match is found on the stream of characters seen so far. Only very basic (classic) regular expressions are needed, so a DFA/NFA based implementation seems like it would be well-suited to the problem. Based on the fact that it's possible to do regular expression matching using a DFA/NFA in a single linear sweep, it seems like a streaming implementation should be possible. Requirements: The library should try to wait until the full string has been read before performing the match. The data I have really is streaming; there is no way to know how much data will arrive, it's not possible to seek forward or backward. Implementing specific stream matching for a couple special cases is not an option, as I don't know in advance what patterns a user might want to look for. For the curious, my use case is the following: I have a system which intercepts memory writes inside a full system emulator, and I would like to have a way to identify memory writes that match a regular expression (e.g., one could use this to find the point in the system where a URL is written to memory). I have found (links de-linkified because I don't have enough reputation): stackoverflow.com/questions/1962220/apply-a-regex-on-stream stackoverflow.com/questions/716927/applying-a-regular-expression-to-a-java-i-o-stream www.codeguru.com/csharp/csharp/cs_data/searching/article.php/c14689/Building-a-Regular-Expression-Stream-Search-with-the-NET-Framework.htm But all of these attempt to convert the stream to a string first and then use a stock regular expression library. Another thought I had was to modify the RE2 library, but according to the author it is architected around the assumption that the entire string is in memory at the same time. If nothing's available, then I can start down the unhappy path of reinventing this wheel to fit my own needs, but I'd really rather not if I can avoid it. Any help would be greatly appreciated!

    Read the article

  • Efficient mass string search problem.

    - by Monomer
    The Problem: A large static list of strings is provided. A pattern string comprised of data and wildcard elements (* and ?). The idea is to return all the strings that match the pattern - simple enough. Current Solution: I'm currently using a linear approach of scanning the large list and globbing each entry against the pattern. My Question: Are there any suitable data structures that I can store the large list into such that the search's complexity is less than O(n)? Perhaps something akin to a suffix-trie? I've also considered using bi- and tri-grams in a hashtable, but the logic required in evaluating a match based on a merge of the list of words returned and the pattern is a nightmare, and I'm not convinced its the correct approach.

    Read the article

  • Using Closure Properties to prove Regularity

    - by WATWF
    Here's a homework problem: Is L_4 Regular? Let L_4 = L*, where L={0^i1^i | i>=1}. I know L is non-regular and I know that Kleene Star is a closed operation, so my assumption is that L_4 is non-regular. However my professor provided an example of the above in which L = {0^p | p is prime}, which he said was regular by proving that L* was equal to L(000* + e) by saying each was a subset of one another (e in this case means the empty word). So his method involved forming a regex of 0^p, but how I can do that when I essentially have one already?

    Read the article

  • Alternatives to userfly.com

    - by dfa
    During my master thesis I need to study how my users interact with my webapp. There are alternatives to userfly.com? I want just to know how I can do some usability testing without much hassle. Requests: must work under https cheap unobtrusive, if possible

    Read the article

  • GAE/J: unable to register a custom ELResolver

    - by dfa
    I need to register a custom ELResolver for a Google App Engine project. Since it must be registered before any request is received, as specified by the Javadoc: It is illegal to register an ELResolver after the application has received any request from the client. If an attempt is made to register an ELResolver after that time, an IllegalStateException is thrown. I'm using a ServletContextListener: public class RegisterCustomELResolver implements ServletContextListener { @Override public void contextInitialized(ServletContextEvent sce) { ServletContext context = sce.getServletContext(); JspApplicationContext jspContext = JspFactory.getDefaultFactory().getJspApplicationContext(context); jspContext.addELResolver(new MyELResolver()); } ... } The problem is that JspFactory.getDefaultFactory() returns always null. I've alreay filled a bug report. Any idea for a workaround?

    Read the article

  • How do I get an Enter USB TV Box TV tuner aka Gadmei UTV302 to work?

    - by Subhash
    Has anyone had any success in using the Enter USB TV Box from Enter Multimedia? It comes bundled with software that works in Windows. I have had no luck using it in Ubuntu 10.10. Update 1 Here is the output from lsusb Bus 007 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub Bus 006 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub Bus 005 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub Bus 004 Device 003: ID 093a:2510 Pixart Imaging, Inc. Optical Mouse Bus 004 Device 002: ID 046d:c312 Logitech, Inc. DeLuxe 250 Keyboard Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub Bus 001 Device 006: ID 1f71:3301 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub I can't find the Enter USB TV Box listed in this. In the dmesg tail command, I found something that seems to be related to the card: usb 1-5: new high speed USB device using ehci_hcd and address 6 usb 1-5: config 1 interface 0 altsetting 1 bulk endpoint 0x83 has invalid maxpacket 256 Update 2 From Windows I learned that this USB TV tuner uses some chipset from Gadmei corporation. All computer stores in India sell Enter USB TV Box if you ask for an USB TV tuner. No other brand seems to be interested in this market. Update 3 I learned that this TV tuner is rebranded version of Gadmei UTV302 (USB TV Tuner Box). Update 4 I tried adding em28xx as the chipset (as suggested by user BOBBO below) for the tuner but that did not work. I went back to my Pinnacle PCTV internal card. I don't think the tuner referred by UbuntuForums (Gadmei UTV 330) and the tuner that I have (Gadmei UTV 302) are the same. My USB tuner is several times bigger. My tuner seems to be a newer device with a newer tuner chip. I will submit details of this device to the LinuxTV developers this weekend. Update 5 I opened the tuner box and found that it uses a tuner from a Chinese company - Tenas. Model is TNF 8022-DFA. Update 6 Tuner chip specs (retrived from supplier directory) for Tenas TNF 8022-DFA. Supply voltage: true 5V device(low power dissipation) Control system: I2C bus control of tuning, address selection Tuning system: PLL controlled tuning Receiving system: system PAL D/K,IF(Intermediate Frequency): 38MHz Receiving channels: full frequency range from channel DS1 (49.75MHz) to channel DS57 (863.25MHz); Use Texas Instruments SN761678 IC solution, with mini install size Update 7 Reverse side of the circuit board. Picture of the TV tuner

    Read the article

  • snort analysis of wireshark capture

    - by Ben Voigt
    I'm trying to identify trouble users on our network. ntop identifies high traffic and high connection users, but malware doesn't always need high bandwidth to really mess things up. So I am trying to do offline analysis with snort (don't want to burden the router with inline analysis of 20 Mbps traffic). Apparently snort provides a -r option for this purpose, but I can't get the analysis to run. The analysis system is gentoo, amd64, in case that makes any difference. I've already used oinkmaster to download the latest IDS signatures. But when I try to run snort, I keep getting the following error: % snort -V ,,_ -*> Snort! <*- o" )~ Version 2.9.0.3 IPv6 GRE (Build 98) x86_64-linux '''' By Martin Roesch & The Snort Team: http://www.snort.org/snort/snort-team Copyright (C) 1998-2010 Sourcefire, Inc., et al. Using libpcap version 1.1.1 Using PCRE version: 8.11 2010-12-10 Using ZLIB version: 1.2.5 %> snort -v -r jan21-for-snort.cap -c /etc/snort/snort.conf -l ~/snortlog/ (snip) 273 out of 1024 flowbits in use. [ Port Based Pattern Matching Memory ] +- [ Aho-Corasick Summary ] ------------------------------------- | Storage Format : Full-Q | Finite Automaton : DFA | Alphabet Size : 256 Chars | Sizeof State : Variable (1,2,4 bytes) | Instances : 314 | 1 byte states : 304 | 2 byte states : 10 | 4 byte states : 0 | Characters : 69371 | States : 58631 | Transitions : 3471623 | State Density : 23.1% | Patterns : 3020 | Match States : 2934 | Memory (MB) : 29.66 | Patterns : 0.36 | Match Lists : 0.77 | DFA | 1 byte states : 1.37 | 2 byte states : 26.59 | 4 byte states : 0.00 +---------------------------------------------------------------- [ Number of patterns truncated to 20 bytes: 563 ] ERROR: Can't find pcap DAQ! Fatal Error, Quitting.. net-libs/daq is installed, but I don't even want to capture traffic, I just want to process the capture file. What configuration options should I be setting/unsetting in order to do offline analysis instead of real-time capture?

    Read the article

  • Convert regular expression to CFG

    - by user242581
    How can I convert some regular language to its equivalent Context Free Grammar(CFG)? Whether the DFA corresponding to that regular expression is required to be constructed or is there some rule for the above conversion? For example, considering the following regular expression 01+10(11)* How can I describe the grammar corresponding to the above RE?

    Read the article

  • Python regular expression implementation details

    - by Tom
    A question that I answered got me wondering: How are regular expressions implemented in Python? What sort of efficiency guarantees are there? Is the implementation "standard", or is it subject to change? I thought that regular expressions would be implemented as DFAs, and therefore were very efficient (requiring at most one scan of the input string). Laurence Gonsalves raised an interesting point that not all Python regular expressions are regular. (His example is r"(a+)b\1", which matches some number of a's, a b, and then the same number of a's as before). This clearly cannot be implemented with a DFA. So, to reiterate: what are the implementation details and guarantees of Python regular expressions? It would also be nice if someone could give some sort of explanation (in light of the implementation) as to why the regular expressions "cat|catdog" and "catdog|cat" lead to different search results in the string "catdog", as mentioned in the question that I referenced before.

    Read the article

  • What should I call a class that contains a sequence of states

    - by Robert P
    I have a GUI tool that manages state sequences. One component is a class that contains a set of states, your typical DFA state machine. For now, I'll call this a StateSet. However, I have another class that has a collection (possibly partially unordered) of those state sets, and lists them in a particular order. and I'm trying to come up with a good name for it - not just for internal code, but for customers to refer to it. I've got: Sequence (maybe) StateSetSet (reasonable for code, but not for customers) Any other suggestions or ideas?

    Read the article

  • Returning a href within a string

    - by user701254
    How can I return a href within a string, I can access the start position but not sure how to get last position : Here is what I have so far : String str = "sadf ad fas dfa http:\\www.google.com sdfa sadf as dfas"; int index = str.indexOf("http"); String href = str.substring(index , ???); What should the end index be ? Note, this is targeted at j2me & I need to minimise download footprint so I cannot use regular expressions or third party regular expressions libraries.

    Read the article

  • Squid + Dans Guardian (simple configuration)

    - by The Digital Ninja
    I just built a new proxy server and compiled the latest versions of squid and dansguardian. We use basic authentication to select what users are allowed outside of our network. It seems squid is working just fine and accepts my username and password and lets me out. But if i connect to dans guardian, it prompts for username and password and then displays a message saying my username is not allowed to access the internet. Its pulling my username for the error message so i know it knows who i am. The part i get confused on is i thought that part was handled all by squid, and squid is working flawlessly. Can someone please double check my config files and tell me if i'm missing something or there is some new option i must set to get this to work. dansguardian.conf # Web Access Denied Reporting (does not affect logging) # # -1 = log, but do not block - Stealth mode # 0 = just say 'Access Denied' # 1 = report why but not what denied phrase # 2 = report fully # 3 = use HTML template file (accessdeniedaddress ignored) - recommended # reportinglevel = 3 # Language dir where languages are stored for internationalisation. # The HTML template within this dir is only used when reportinglevel # is set to 3. When used, DansGuardian will display the HTML file instead of # using the perl cgi script. This option is faster, cleaner # and easier to customise the access denied page. # The language file is used no matter what setting however. # languagedir = '/etc/dansguardian/languages' # language to use from languagedir. language = 'ukenglish' # Logging Settings # # 0 = none 1 = just denied 2 = all text based 3 = all requests loglevel = 3 # Log Exception Hits # Log if an exception (user, ip, URL, phrase) is matched and so # the page gets let through. Can be useful for diagnosing # why a site gets through the filter. on | off logexceptionhits = on # Log File Format # 1 = DansGuardian format 2 = CSV-style format # 3 = Squid Log File Format 4 = Tab delimited logfileformat = 1 # Log file location # # Defines the log directory and filename. #loglocation = '/var/log/dansguardian/access.log' # Network Settings # # the IP that DansGuardian listens on. If left blank DansGuardian will # listen on all IPs. That would include all NICs, loopback, modem, etc. # Normally you would have your firewall protecting this, but if you want # you can limit it to only 1 IP. Yes only one. filterip = # the port that DansGuardian listens to. filterport = 8080 # the ip of the proxy (default is the loopback - i.e. this server) proxyip = 127.0.0.1 # the port DansGuardian connects to proxy on proxyport = 3128 # accessdeniedaddress is the address of your web server to which the cgi # dansguardian reporting script was copied # Do NOT change from the default if you are not using the cgi. # accessdeniedaddress = 'http://YOURSERVER.YOURDOMAIN/cgi-bin/dansguardian.pl' # Non standard delimiter (only used with accessdeniedaddress) # Default is enabled but to go back to the original standard mode dissable it. nonstandarddelimiter = on # Banned image replacement # Images that are banned due to domain/url/etc reasons including those # in the adverts blacklists can be replaced by an image. This will, # for example, hide images from advert sites and remove broken image # icons from banned domains. # 0 = off # 1 = on (default) usecustombannedimage = 1 custombannedimagefile = '/etc/dansguardian/transparent1x1.gif' # Filter groups options # filtergroups sets the number of filter groups. A filter group is a set of content # filtering options you can apply to a group of users. The value must be 1 or more. # DansGuardian will automatically look for dansguardianfN.conf where N is the filter # group. To assign users to groups use the filtergroupslist option. All users default # to filter group 1. You must have some sort of authentication to be able to map users # to a group. The more filter groups the more copies of the lists will be in RAM so # use as few as possible. filtergroups = 1 filtergroupslist = '/etc/dansguardian/filtergroupslist' # Authentication files location bannediplist = '/etc/dansguardian/bannediplist' exceptioniplist = '/etc/dansguardian/exceptioniplist' banneduserlist = '/etc/dansguardian/banneduserlist' exceptionuserlist = '/etc/dansguardian/exceptionuserlist' # Show weighted phrases found # If enabled then the phrases found that made up the total which excedes # the naughtyness limit will be logged and, if the reporting level is # high enough, reported. on | off showweightedfound = on # Weighted phrase mode # There are 3 possible modes of operation: # 0 = off = do not use the weighted phrase feature. # 1 = on, normal = normal weighted phrase operation. # 2 = on, singular = each weighted phrase found only counts once on a page. # weightedphrasemode = 2 # Positive result caching for text URLs # Caches good pages so they don't need to be scanned again # 0 = off (recommended for ISPs with users with disimilar browsing) # 1000 = recommended for most users # 5000 = suggested max upper limit urlcachenumber = # # Age before they are stale and should be ignored in seconds # 0 = never # 900 = recommended = 15 mins urlcacheage = # Smart and Raw phrase content filtering options # Smart is where the multiple spaces and HTML are removed before phrase filtering # Raw is where the raw HTML including meta tags are phrase filtered # CPU usage can be effectively halved by using setting 0 or 1 # 0 = raw only # 1 = smart only # 2 = both (default) phrasefiltermode = 2 # Lower casing options # When a document is scanned the uppercase letters are converted to lower case # in order to compare them with the phrases. However this can break Big5 and # other 16-bit texts. If needed preserve the case. As of version 2.7.0 accented # characters are supported. # 0 = force lower case (default) # 1 = do not change case preservecase = 0 # Hex decoding options # When a document is scanned it can optionally convert %XX to chars. # If you find documents are getting past the phrase filtering due to encoding # then enable. However this can break Big5 and other 16-bit texts. # 0 = disabled (default) # 1 = enabled hexdecodecontent = 0 # Force Quick Search rather than DFA search algorithm # The current DFA implementation is not totally 16-bit character compatible # but is used by default as it handles large phrase lists much faster. # If you wish to use a large number of 16-bit character phrases then # enable this option. # 0 = off (default) # 1 = on (Big5 compatible) forcequicksearch = 0 # Reverse lookups for banned site and URLs. # If set to on, DansGuardian will look up the forward DNS for an IP URL # address and search for both in the banned site and URL lists. This would # prevent a user from simply entering the IP for a banned address. # It will reduce searching speed somewhat so unless you have a local caching # DNS server, leave it off and use the Blanket IP Block option in the # bannedsitelist file instead. reverseaddresslookups = off # Reverse lookups for banned and exception IP lists. # If set to on, DansGuardian will look up the forward DNS for the IP # of the connecting computer. This means you can put in hostnames in # the exceptioniplist and bannediplist. # It will reduce searching speed somewhat so unless you have a local DNS server, # leave it off. reverseclientiplookups = off # Build bannedsitelist and bannedurllist cache files. # This will compare the date stamp of the list file with the date stamp of # the cache file and will recreate as needed. # If a bsl or bul .processed file exists, then that will be used instead. # It will increase process start speed by 300%. On slow computers this will # be significant. Fast computers do not need this option. on | off createlistcachefiles = on # POST protection (web upload and forms) # does not block forms without any file upload, i.e. this is just for # blocking or limiting uploads # measured in kibibytes after MIME encoding and header bumph # use 0 for a complete block # use higher (e.g. 512 = 512Kbytes) for limiting # use -1 for no blocking #maxuploadsize = 512 #maxuploadsize = 0 maxuploadsize = -1 # Max content filter page size # Sometimes web servers label binary files as text which can be very # large which causes a huge drain on memory and cpu resources. # To counter this, you can limit the size of the document to be # filtered and get it to just pass it straight through. # This setting also applies to content regular expression modification. # The size is in Kibibytes - eg 2048 = 2Mb # use 0 for no limit maxcontentfiltersize = # Username identification methods (used in logging) # You can have as many methods as you want and not just one. The first one # will be used then if no username is found, the next will be used. # * proxyauth is for when basic proxy authentication is used (no good for # transparent proxying). # * ntlm is for when the proxy supports the MS NTLM authentication # protocol. (Only works with IE5.5 sp1 and later). **NOT IMPLEMENTED** # * ident is for when the others don't work. It will contact the computer # that the connection came from and try to connect to an identd server # and query it for the user owner of the connection. usernameidmethodproxyauth = on usernameidmethodntlm = off # **NOT IMPLEMENTED** usernameidmethodident = off # Preemptive banning - this means that if you have proxy auth enabled and a user accesses # a site banned by URL for example they will be denied straight away without a request # for their user and pass. This has the effect of requiring the user to visit a clean # site first before it knows who they are and thus maybe an admin user. # This is how DansGuardian has always worked but in some situations it is less than # ideal. So you can optionally disable it. Default is on. # As a side effect disabling this makes AD image replacement work better as the mime # type is know. preemptivebanning = on # Misc settings # if on it adds an X-Forwarded-For: <clientip> to the HTTP request # header. This may help solve some problem sites that need to know the # source ip. on | off forwardedfor = on # if on it uses the X-Forwarded-For: <clientip> to determine the client # IP. This is for when you have squid between the clients and DansGuardian. # Warning - headers are easily spoofed. on | off usexforwardedfor = off # if on it logs some debug info regarding fork()ing and accept()ing which # can usually be ignored. These are logged by syslog. It is safe to leave # it on or off logconnectionhandlingerrors = on # Fork pool options # sets the maximum number of processes to sporn to handle the incomming # connections. Max value usually 250 depending on OS. # On large sites you might want to try 180. maxchildren = 180 # sets the minimum number of processes to sporn to handle the incomming connections. # On large sites you might want to try 32. minchildren = 32 # sets the minimum number of processes to be kept ready to handle connections. # On large sites you might want to try 8. minsparechildren = 8 # sets the minimum number of processes to sporn when it runs out # On large sites you might want to try 10. preforkchildren = 10 # sets the maximum number of processes to have doing nothing. # When this many are spare it will cull some of them. # On large sites you might want to try 64. maxsparechildren = 64 # sets the maximum age of a child process before it croaks it. # This is the number of connections they handle before exiting. # On large sites you might want to try 10000. maxagechildren = 5000 # Process options # (Change these only if you really know what you are doing). # These options allow you to run multiple instances of DansGuardian on a single machine. # Remember to edit the log file path above also if that is your intention. # IPC filename # # Defines IPC server directory and filename used to communicate with the log process. ipcfilename = '/tmp/.dguardianipc' # URL list IPC filename # # Defines URL list IPC server directory and filename used to communicate with the URL # cache process. urlipcfilename = '/tmp/.dguardianurlipc' # PID filename # # Defines process id directory and filename. #pidfilename = '/var/run/dansguardian.pid' # Disable daemoning # If enabled the process will not fork into the background. # It is not usually advantageous to do this. # on|off ( defaults to off ) nodaemon = off # Disable logging process # on|off ( defaults to off ) nologger = off # Daemon runas user and group # This is the user that DansGuardian runs as. Normally the user/group nobody. # Uncomment to use. Defaults to the user set at compile time. # daemonuser = 'nobody' # daemongroup = 'nobody' # Soft restart # When on this disables the forced killing off all processes in the process group. # This is not to be confused with the -g run time option - they are not related. # on|off ( defaults to off ) softrestart = off maxcontentramcachescansize = 2000 maxcontentfilecachescansize = 20000 downloadmanager = '/etc/dansguardian/downloadmanagers/default.conf' authplugin = '/etc/dansguardian/authplugins/proxy-basic.conf' Squid.conf http_port 3128 hierarchy_stoplist cgi-bin ? acl QUERY urlpath_regex cgi-bin \? cache deny QUERY acl apache rep_header Server ^Apache #broken_vary_encoding allow apache access_log /squid/var/logs/access.log squid hosts_file /etc/hosts auth_param basic program /squid/libexec/ncsa_auth /squid/etc/userbasic.auth auth_param basic children 5 auth_param basic realm proxy auth_param basic credentialsttl 2 hours auth_param basic casesensitive off refresh_pattern ^ftp: 1440 20% 10080 refresh_pattern ^gopher: 1440 0% 1440 refresh_pattern . 0 20% 4320 acl NoAuthNec src <HIDDEN FOR SECURITY> acl BrkRm src <HIDDEN FOR SECURITY> acl Dials src <HIDDEN FOR SECURITY> acl Comps src <HIDDEN FOR SECURITY> acl whsws dstdom_regex -i .opensuse.org .novell.com .suse.com mirror.mcs.an1.gov mirrors.kernerl.org www.suse.de suse.mirrors.tds.net mirrros.usc.edu ftp.ale.org suse.cs.utah.edu mirrors.usc.edu mirror.usc.an1.gov linux.nssl.noaa.gov noaa.gov .kernel.org ftp.ale.org ftp.gwdg.de .medibuntu.org mirrors.xmission.com .canonical.com .ubuntu. acl opensites dstdom_regex -i .mbsbooks.com .bowker.com .usps.com .usps.gov .ups.com .fedex.com go.microsoft.com .microsoft.com .apple.com toolbar.msn.com .contacts.msn.com update.services.openoffice.org fms2.pointroll.speedera.net services.wmdrm.windowsmedia.com windowsupdate.com .adobe.com .symantec.com .vitalbook.com vxn1.datawire.net vxn.datawire.net download.lavasoft.de .download.lavasoft.com .lavasoft.com updates.ls-servers.com .canadapost. .myyellow.com minirick symantecliveupdate.com wm.overdrive.com www.overdrive.com productactivation.one.microsoft.com www.update.microsoft.com testdrive.whoson.com www.columbia.k12.mo.us banners.wunderground.com .kofax.com .gotomeeting.com tools.google.com .dl.google.com .cache.googlevideo.com .gpdl.google.com .clients.google.com cache.pack.google.com kh.google.com maps.google.com auth.keyhole.com .contacts.msn.com .hrblock.com .taxcut.com .merchantadvantage.com .jtv.com .malwarebytes.org www.google-analytics.com dcs.support.xerox.com .dhl.com .webtrendslive.com javadl-esd.sun.com javadl-alt.sun.com .excelsior.edu .dhlglobalmail.com .nessus.org .foxitsoftware.com foxit.vo.llnwd.net installshield.com .mindjet.com .mediascouter.com media.us.elsevierhealth.com .xplana.com .govtrack.us sa.tulsacc.edu .omniture.com fpdownload.macromedia.com webservices.amazon.com acl password proxy_auth REQUIRED acl all src all acl manager proto cache_object acl localhost src 127.0.0.1/255.255.255.255 acl to_localhost dst 127.0.0.0/8 acl SSL_ports port 443 563 631 2001 2005 8731 9001 9080 10000 acl Safe_ports port 80 # http acl Safe_ports port 21 # ftp acl Safe_ports port # https, snews 443 563 acl Safe_ports port 70 # gopher acl Safe_ports port 210 # wais acl Safe_ports port # unregistered ports 1936-65535 acl Safe_ports port 280 # http-mgmt acl Safe_ports port 488 # gss-http acl Safe_ports port 10000 acl Safe_ports port 631 acl Safe_ports port 901 # SWAT acl purge method PURGE acl CONNECT method CONNECT acl UTubeUsers proxy_auth "/squid/etc/utubeusers.list" acl RestrictUTube dstdom_regex -i youtube.com acl RestrictFacebook dstdom_regex -i facebook.com acl FacebookUsers proxy_auth "/squid/etc/facebookusers.list" acl BuemerKEC src 10.10.128.0/24 acl MBSsortnet src 10.10.128.0/26 acl MSNExplorer browser -i MSN acl Printers src <HIDDEN FOR SECURITY> acl SpecialFolks src <HIDDEN FOR SECURITY> # streaming download acl fails rep_mime_type ^.*mms.* acl fails rep_mime_type ^.*ms-hdr.* acl fails rep_mime_type ^.*x-fcs.* acl fails rep_mime_type ^.*x-ms-asf.* acl fails2 urlpath_regex dvrplayer mediastream mms:// acl fails2 urlpath_regex \.asf$ \.afx$ \.flv$ \.swf$ acl deny_rep_mime_flashvideo rep_mime_type -i video/flv acl deny_rep_mime_shockwave rep_mime_type -i ^application/x-shockwave-flash$ acl x-type req_mime_type -i ^application/octet-stream$ acl x-type req_mime_type -i application/octet-stream acl x-type req_mime_type -i ^application/x-mplayer2$ acl x-type req_mime_type -i application/x-mplayer2 acl x-type req_mime_type -i ^application/x-oleobject$ acl x-type req_mime_type -i application/x-oleobject acl x-type req_mime_type -i application/x-pncmd acl x-type req_mime_type -i ^video/x-ms-asf$ acl x-type2 rep_mime_type -i ^application/octet-stream$ acl x-type2 rep_mime_type -i application/octet-stream acl x-type2 rep_mime_type -i ^application/x-mplayer2$ acl x-type2 rep_mime_type -i application/x-mplayer2 acl x-type2 rep_mime_type -i ^application/x-oleobject$ acl x-type2 rep_mime_type -i application/x-oleobject acl x-type2 rep_mime_type -i application/x-pncmd acl x-type2 rep_mime_type -i ^video/x-ms-asf$ acl RestrictHulu dstdom_regex -i hulu.com acl broken dstdomain cms.montgomerycollege.edu events.columbiamochamber.com members.columbiamochamber.com public.genexusserver.com acl RestrictVimeo dstdom_regex -i vimeo.com acl http_port port 80 #http_reply_access deny deny_rep_mime_flashvideo #http_reply_access deny deny_rep_mime_shockwave #streaming files #http_access deny fails #http_reply_access deny fails #http_access deny fails2 #http_reply_access deny fails2 #http_access deny x-type #http_reply_access deny x-type #http_access deny x-type2 #http_reply_access deny x-type2 follow_x_forwarded_for allow localhost acl_uses_indirect_client on log_uses_indirect_client on http_access allow manager localhost http_access deny manager http_access allow purge localhost http_access deny purge http_access allow SpecialFolks http_access deny CONNECT !SSL_ports http_access allow whsws http_access allow opensites http_access deny BuemerKEC !MBSsortnet http_access deny BrkRm RestrictUTube RestrictFacebook RestrictVimeo http_access allow RestrictUTube UTubeUsers http_access deny RestrictUTube http_access allow RestrictFacebook FacebookUsers http_access deny RestrictFacebook http_access deny RestrictHulu http_access allow NoAuthNec http_access allow BrkRm http_access allow FacebookUsers RestrictVimeo http_access deny RestrictVimeo http_access allow Comps http_access allow Dials http_access allow Printers http_access allow password http_access deny !Safe_ports http_access deny SSL_ports !CONNECT http_access allow http_port http_access deny all http_reply_access allow all icp_access allow all access_log /squid/var/logs/access.log squid visible_hostname proxy.site.com forwarded_for off coredump_dir /squid/cache/ #header_access Accept-Encoding deny broken #acl snmppublic snmp_community mysecretcommunity #snmp_port 3401 #snmp_access allow snmppublic all cache_mem 3 GB #acl snmppublic snmp_community mbssquid #snmp_port 3401 #snmp_access allow snmppublic all

    Read the article

  • The Virtues and Challenges of Implementing Basel III: What Every CFO and CRO Needs To Know

    - by Jenna Danko
    The Basel Committee on Banking Supervision (BCBS) is a group tasked with providing thought-leadership to the global banking industry.  Over the years, the BCBS has released volumes of guidance in an effort to promote stability within the financial sector.  By effectively communicating best-practices, the Basel Committee has influenced financial regulations worldwide.  Basel regulations are intended to help banks: More easily absorb shocks due to various forms of financial-economic stress Improve risk management and governance Enhance regulatory reporting and transparency In June 2011, the BCBS released Basel III: A global regulatory framework for more resilient banks and banking systems.  This new set of regulations included many enhancements to previous rules and will have both short and long term impacts on the banking industry.  Some of the key features of Basel III include: A stronger capital base More stringent capital standards and higher capital requirements Introduction of capital buffers  Additional risk coverage Enhanced quantification of counterparty credit risk Credit valuation adjustments  Wrong  way risk  Asset Value Correlation Multiplier for large financial institutions Liquidity management and monitoring Introduction of leverage ratio Even more rigorous data requirements To implement these features banks need to embark on a journey replete with challenges. These can be categorized into three key areas: Data, Models and Compliance. Data Challenges Data quality - All standard dimensions of Data Quality (DQ) have to be demonstrated.  Manual approaches are now considered too cumbersome and automation has become the norm. Data lineage - Data lineage has to be documented and demonstrated.  The PPT / Excel approach to documentation is being replaced by metadata tools.  Data lineage has become dynamic due to a variety of factors, making static documentation out-dated quickly.  Data dictionaries - A strong and clean business glossary is needed with proper identification of business owners for the data.  Data integrity - A strong, scalable architecture with work flow tools helps demonstrate data integrity.  Manual touch points have to be minimized.   Data relevance/coverage - Data must be relevant to all portfolios and storage devices must allow for sufficient data retention.  Coverage of both on and off balance sheet exposures is critical.   Model Challenges Model development - Requires highly trained resources with both quantitative and subject matter expertise. Model validation - All Basel models need to be validated. This requires additional resources with skills that may not be readily available in the marketplace.  Model documentation - All models need to be adequately documented.  Creation of document templates and model development processes/procedures is key. Risk and finance integration - This integration is necessary for Basel as the Allowance for Loan and Lease Losses (ALLL) is calculated by Finance, yet Expected Loss (EL) is calculated by Risk Management – and they need to somehow be equal.  This is tricky at best from an implementation perspective.  Compliance Challenges Rules interpretation - Some Basel III requirements leave room for interpretation.  A misinterpretation of regulations can lead to delays in Basel compliance and undesired reprimands from supervisory authorities. Gap identification and remediation - Internal identification and remediation of gaps ensures smoother Basel compliance and audit processes.  However business lines are challenged by the competing priorities which arise from regulatory compliance and business as usual work.  Qualification readiness - Providing internal and external auditors with robust evidence of a thorough examination of the readiness to proceed to parallel run and Basel qualification  In light of new regulations like Basel III and local variations such as the Dodd Frank Act (DFA) and Comprehensive Capital Analysis and Review (CCAR) in the US, banks are now forced to ask themselves many difficult questions.  For example, executives must consider: How will Basel III play into their Risk Appetite? How will they create project plans for Basel III when they haven’t yet finished implementing Basel II? How will new regulations impact capital structure including profitability and capital distributions to shareholders? After all, new regulations often lead to diminished profitability as well as an assortment of implementation problems as we discussed earlier in this note.  However, by requiring banks to focus on premium growth, regulators increase the potential for long-term profitability and sustainability.  And a more stable banking system: Increases consumer confidence which in turn supports banking activity  Ensures that adequate funding is available for individuals and companies Puts regulators at ease, allowing bankers to focus on banking Stability is intended to bring long-term profitability to banks.  Therefore, it is important that every banking institution takes the steps necessary to properly manage, monitor and disclose its risks.  This can be done with the assistance and oversight of an independent regulatory authority.  A spectrum of banks exist today wherein some continue to debate and negotiate with regulators over the implementation of new requirements, while others are simply choosing to embrace them for the benefits I highlighted above. Do share with me how your institution is coping with and embracing these new regulations within your bank. Dr. Varun Agarwal is a Principal in the Banking Practice for Capgemini Financial Services.  He has over 19 years experience in areas that span from enterprise risk management, credit, market, and to country risk management; financial modeling and valuation; and international financial markets research and analyses.

    Read the article

  • Toorcon 15 (2013)

    - by danx
    The Toorcon gang (senior staff): h1kari (founder), nfiltr8, and Geo Introduction to Toorcon 15 (2013) A Tale of One Software Bypass of MS Windows 8 Secure Boot Breaching SSL, One Byte at a Time Running at 99%: Surviving an Application DoS Security Response in the Age of Mass Customized Attacks x86 Rewriting: Defeating RoP and other Shinanighans Clowntown Express: interesting bugs and running a bug bounty program Active Fingerprinting of Encrypted VPNs Making Attacks Go Backwards Mask Your Checksums—The Gorry Details Adventures with weird machines thirty years after "Reflections on Trusting Trust" Introduction to Toorcon 15 (2013) Toorcon 15 is the 15th annual security conference held in San Diego. I've attended about a third of them and blogged about previous conferences I attended here starting in 2003. As always, I've only summarized the talks I attended and interested me enough to write about them. Be aware that I may have misrepresented the speaker's remarks and that they are not my remarks or opinion, or those of my employer, so don't quote me or them. Those seeking further details may contact the speakers directly or use The Google. For some talks, I have a URL for further information. A Tale of One Software Bypass of MS Windows 8 Secure Boot Andrew Furtak and Oleksandr Bazhaniuk Yuri Bulygin, Oleksandr ("Alex") Bazhaniuk, and (not present) Andrew Furtak Yuri and Alex talked about UEFI and Bootkits and bypassing MS Windows 8 Secure Boot, with vendor recommendations. They previously gave this talk at the BlackHat 2013 conference. MS Windows 8 Secure Boot Overview UEFI (Unified Extensible Firmware Interface) is interface between hardware and OS. UEFI is processor and architecture independent. Malware can replace bootloader (bootx64.efi, bootmgfw.efi). Once replaced can modify kernel. Trivial to replace bootloader. Today many legacy bootkits—UEFI replaces them most of them. MS Windows 8 Secure Boot verifies everything you load, either through signatures or hashes. UEFI firmware relies on secure update (with signed update). You would think Secure Boot would rely on ROM (such as used for phones0, but you can't do that for PCs—PCs use writable memory with signatures DXE core verifies the UEFI boat loader(s) OS Loader (winload.efi, winresume.efi) verifies the OS kernel A chain of trust is established with a root key (Platform Key, PK), which is a cert belonging to the platform vendor. Key Exchange Keys (KEKs) verify an "authorized" database (db), and "forbidden" database (dbx). X.509 certs with SHA-1/SHA-256 hashes. Keys are stored in non-volatile (NV) flash-based NVRAM. Boot Services (BS) allow adding/deleting keys (can't be accessed once OS starts—which uses Run-Time (RT)). Root cert uses RSA-2048 public keys and PKCS#7 format signatures. SecureBoot — enable disable image signature checks SetupMode — update keys, self-signed keys, and secure boot variables CustomMode — allows updating keys Secure Boot policy settings are: always execute, never execute, allow execute on security violation, defer execute on security violation, deny execute on security violation, query user on security violation Attacking MS Windows 8 Secure Boot Secure Boot does NOT protect from physical access. Can disable from console. Each BIOS vendor implements Secure Boot differently. There are several platform and BIOS vendors. It becomes a "zoo" of implementations—which can be taken advantage of. Secure Boot is secure only when all vendors implement it correctly. Allow only UEFI firmware signed updates protect UEFI firmware from direct modification in flash memory protect FW update components program SPI controller securely protect secure boot policy settings in nvram protect runtime api disable compatibility support module which allows unsigned legacy Can corrupt the Platform Key (PK) EFI root certificate variable in SPI flash. If PK is not found, FW enters setup mode wich secure boot turned off. Can also exploit TPM in a similar manner. One is not supposed to be able to directly modify the PK in SPI flash from the OS though. But they found a bug that they can exploit from User Mode (undisclosed) and demoed the exploit. It loaded and ran their own bootkit. The exploit requires a reboot. Multiple vendors are vulnerable. They will disclose this exploit to vendors in the future. Recommendations: allow only signed updates protect UEFI fw in ROM protect EFI variable store in ROM Breaching SSL, One Byte at a Time Yoel Gluck and Angelo Prado Angelo Prado and Yoel Gluck, Salesforce.com CRIME is software that performs a "compression oracle attack." This is possible because the SSL protocol doesn't hide length, and because SSL compresses the header. CRIME requests with every possible character and measures the ciphertext length. Look for the plaintext which compresses the most and looks for the cookie one byte-at-a-time. SSL Compression uses LZ77 to reduce redundancy. Huffman coding replaces common byte sequences with shorter codes. US CERT thinks the SSL compression problem is fixed, but it isn't. They convinced CERT that it wasn't fixed and they issued a CVE. BREACH, breachattrack.com BREACH exploits the SSL response body (Accept-Encoding response, Content-Encoding). It takes advantage of the fact that the response is not compressed. BREACH uses gzip and needs fairly "stable" pages that are static for ~30 seconds. It needs attacker-supplied content (say from a web form or added to a URL parameter). BREACH listens to a session's requests and responses, then inserts extra requests and responses. Eventually, BREACH guesses a session's secret key. Can use compression to guess contents one byte at-a-time. For example, "Supersecret SupersecreX" (a wrong guess) compresses 10 bytes, and "Supersecret Supersecret" (a correct guess) compresses 11 bytes, so it can find each character by guessing every character. To start the guess, BREACH needs at least three known initial characters in the response sequence. Compression length then "leaks" information. Some roadblocks include no winners (all guesses wrong) or too many winners (multiple possibilities that compress the same). The solutions include: lookahead (guess 2 or 3 characters at-a-time instead of 1 character). Expensive rollback to last known conflict check compression ratio can brute-force first 3 "bootstrap" characters, if needed (expensive) block ciphers hide exact plain text length. Solution is to align response in advance to block size Mitigations length: use variable padding secrets: dynamic CSRF tokens per request secret: change over time separate secret to input-less servlets Future work eiter understand DEFLATE/GZIP HTTPS extensions Running at 99%: Surviving an Application DoS Ryan Huber Ryan Huber, Risk I/O Ryan first discussed various ways to do a denial of service (DoS) attack against web services. One usual method is to find a slow web page and do several wgets. Or download large files. Apache is not well suited at handling a large number of connections, but one can put something in front of it Can use Apache alternatives, such as nginx How to identify malicious hosts short, sudden web requests user-agent is obvious (curl, python) same url requested repeatedly no web page referer (not normal) hidden links. hide a link and see if a bot gets it restricted access if not your geo IP (unless the website is global) missing common headers in request regular timing first seen IP at beginning of attack count requests per hosts (usually a very large number) Use of captcha can mitigate attacks, but you'll lose a lot of genuine users. Bouncer, goo.gl/c2vyEc and www.github.com/rawdigits/Bouncer Bouncer is software written by Ryan in netflow. Bouncer has a small, unobtrusive footprint and detects DoS attempts. It closes blacklisted sockets immediately (not nice about it, no proper close connection). Aggregator collects requests and controls your web proxies. Need NTP on the front end web servers for clean data for use by bouncer. Bouncer is also useful for a popularity storm ("Slashdotting") and scraper storms. Future features: gzip collection data, documentation, consumer library, multitask, logging destroyed connections. Takeaways: DoS mitigation is easier with a complete picture Bouncer designed to make it easier to detect and defend DoS—not a complete cure Security Response in the Age of Mass Customized Attacks Peleus Uhley and Karthik Raman Peleus Uhley and Karthik Raman, Adobe ASSET, blogs.adobe.com/asset/ Peleus and Karthik talked about response to mass-customized exploits. Attackers behave much like a business. "Mass customization" refers to concept discussed in the book Future Perfect by Stan Davis of Harvard Business School. Mass customization is differentiating a product for an individual customer, but at a mass production price. For example, the same individual with a debit card receives basically the same customized ATM experience around the world. Or designing your own PC from commodity parts. Exploit kits are another example of mass customization. The kits support multiple browsers and plugins, allows new modules. Exploit kits are cheap and customizable. Organized gangs use exploit kits. A group at Berkeley looked at 77,000 malicious websites (Grier et al., "Manufacturing Compromise: The Emergence of Exploit-as-a-Service", 2012). They found 10,000 distinct binaries among them, but derived from only a dozen or so exploit kits. Characteristics of Mass Malware: potent, resilient, relatively low cost Technical characteristics: multiple OS, multipe payloads, multiple scenarios, multiple languages, obfuscation Response time for 0-day exploits has gone down from ~40 days 5 years ago to about ~10 days now. So the drive with malware is towards mass customized exploits, to avoid detection There's plenty of evicence that exploit development has Project Manager bureaucracy. They infer from the malware edicts to: support all versions of reader support all versions of windows support all versions of flash support all browsers write large complex, difficult to main code (8750 lines of JavaScript for example Exploits have "loose coupling" of multipe versions of software (adobe), OS, and browser. This allows specific attacks against specific versions of multiple pieces of software. Also allows exploits of more obscure software/OS/browsers and obscure versions. Gave examples of exploits that exploited 2, 3, 6, or 14 separate bugs. However, these complete exploits are more likely to be buggy or fragile in themselves and easier to defeat. Future research includes normalizing malware and Javascript. Conclusion: The coming trend is that mass-malware with mass zero-day attacks will result in mass customization of attacks. x86 Rewriting: Defeating RoP and other Shinanighans Richard Wartell Richard Wartell The attack vector we are addressing here is: First some malware causes a buffer overflow. The malware has no program access, but input access and buffer overflow code onto stack Later the stack became non-executable. The workaround malware used was to write a bogus return address to the stack jumping to malware Later came ASLR (Address Space Layout Randomization) to randomize memory layout and make addresses non-deterministic. The workaround malware used was to jump t existing code segments in the program that can be used in bad ways "RoP" is Return-oriented Programming attacks. RoP attacks use your own code and write return address on stack to (existing) expoitable code found in program ("gadgets"). Pinkie Pie was paid $60K last year for a RoP attack. One solution is using anti-RoP compilers that compile source code with NO return instructions. ASLR does not randomize address space, just "gadgets". IPR/ILR ("Instruction Location Randomization") randomizes each instruction with a virtual machine. Richard's goal was to randomize a binary with no source code access. He created "STIR" (Self-Transofrming Instruction Relocation). STIR disassembles binary and operates on "basic blocks" of code. The STIR disassembler is conservative in what to disassemble. Each basic block is moved to a random location in memory. Next, STIR writes new code sections with copies of "basic blocks" of code in randomized locations. The old code is copied and rewritten with jumps to new code. the original code sections in the file is marked non-executible. STIR has better entropy than ASLR in location of code. Makes brute force attacks much harder. STIR runs on MS Windows (PEM) and Linux (ELF). It eliminated 99.96% or more "gadgets" (i.e., moved the address). Overhead usually 5-10% on MS Windows, about 1.5-4% on Linux (but some code actually runs faster!). The unique thing about STIR is it requires no source access and the modified binary fully works! Current work is to rewrite code to enforce security policies. For example, don't create a *.{exe,msi,bat} file. Or don't connect to the network after reading from the disk. Clowntown Express: interesting bugs and running a bug bounty program Collin Greene Collin Greene, Facebook Collin talked about Facebook's bug bounty program. Background at FB: FB has good security frameworks, such as security teams, external audits, and cc'ing on diffs. But there's lots of "deep, dark, forgotten" parts of legacy FB code. Collin gave several examples of bountied bugs. Some bounty submissions were on software purchased from a third-party (but bounty claimers don't know and don't care). We use security questions, as does everyone else, but they are basically insecure (often easily discoverable). Collin didn't expect many bugs from the bounty program, but they ended getting 20+ good bugs in first 24 hours and good submissions continue to come in. Bug bounties bring people in with different perspectives, and are paid only for success. Bug bounty is a better use of a fixed amount of time and money versus just code review or static code analysis. The Bounty program started July 2011 and paid out $1.5 million to date. 14% of the submissions have been high priority problems that needed to be fixed immediately. The best bugs come from a small % of submitters (as with everything else)—the top paid submitters are paid 6 figures a year. Spammers like to backstab competitors. The youngest sumitter was 13. Some submitters have been hired. Bug bounties also allows to see bugs that were missed by tools or reviews, allowing improvement in the process. Bug bounties might not work for traditional software companies where the product has release cycle or is not on Internet. Active Fingerprinting of Encrypted VPNs Anna Shubina Anna Shubina, Dartmouth Institute for Security, Technology, and Society (I missed the start of her talk because another track went overtime. But I have the DVD of the talk, so I'll expand later) IPsec leaves fingerprints. Using netcat, one can easily visually distinguish various crypto chaining modes just from packet timing on a chart (example, DES-CBC versus AES-CBC) One can tell a lot about VPNs just from ping roundtrips (such as what router is used) Delayed packets are not informative about a network, especially if far away from the network More needed to explore about how TCP works in real life with respect to timing Making Attacks Go Backwards Fuzzynop FuzzyNop, Mandiant This talk is not about threat attribution (finding who), product solutions, politics, or sales pitches. But who are making these malware threats? It's not a single person or group—they have diverse skill levels. There's a lot of fat-fingered fumblers out there. Always look for low-hanging fruit first: "hiding" malware in the temp, recycle, or root directories creation of unnamed scheduled tasks obvious names of files and syscalls ("ClearEventLog") uncleared event logs. Clearing event log in itself, and time of clearing, is a red flag and good first clue to look for on a suspect system Reverse engineering is hard. Disassembler use takes practice and skill. A popular tool is IDA Pro, but it takes multiple interactive iterations to get a clean disassembly. Key loggers are used a lot in targeted attacks. They are typically custom code or built in a backdoor. A big tip-off is that non-printable characters need to be printed out (such as "[Ctrl]" "[RightShift]") or time stamp printf strings. Look for these in files. Presence is not proof they are used. Absence is not proof they are not used. Java exploits. Can parse jar file with idxparser.py and decomile Java file. Java typially used to target tech companies. Backdoors are the main persistence mechanism (provided externally) for malware. Also malware typically needs command and control. Application of Artificial Intelligence in Ad-Hoc Static Code Analysis John Ashaman John Ashaman, Security Innovation Initially John tried to analyze open source files with open source static analysis tools, but these showed thousands of false positives. Also tried using grep, but tis fails to find anything even mildly complex. So next John decided to write his own tool. His approach was to first generate a call graph then analyze the graph. However, the problem is that making a call graph is really hard. For example, one problem is "evil" coding techniques, such as passing function pointer. First the tool generated an Abstract Syntax Tree (AST) with the nodes created from method declarations and edges created from method use. Then the tool generated a control flow graph with the goal to find a path through the AST (a maze) from source to sink. The algorithm is to look at adjacent nodes to see if any are "scary" (a vulnerability), using heuristics for search order. The tool, called "Scat" (Static Code Analysis Tool), currently looks for C# vulnerabilities and some simple PHP. Later, he plans to add more PHP, then JSP and Java. For more information see his posts in Security Innovation blog and NRefactory on GitHub. Mask Your Checksums—The Gorry Details Eric (XlogicX) Davisson Eric (XlogicX) Davisson Sometimes in emailing or posting TCP/IP packets to analyze problems, you may want to mask the IP address. But to do this correctly, you need to mask the checksum too, or you'll leak information about the IP. Problem reports found in stackoverflow.com, sans.org, and pastebin.org are usually not masked, but a few companies do care. If only the IP is masked, the IP may be guessed from checksum (that is, it leaks data). Other parts of packet may leak more data about the IP. TCP and IP checksums both refer to the same data, so can get more bits of information out of using both checksums than just using one checksum. Also, one can usually determine the OS from the TTL field and ports in a packet header. If we get hundreds of possible results (16x each masked nibble that is unknown), one can do other things to narrow the results, such as look at packet contents for domain or geo information. With hundreds of results, can import as CSV format into a spreadsheet. Can corelate with geo data and see where each possibility is located. Eric then demoed a real email report with a masked IP packet attached. Was able to find the exact IP address, given the geo and university of the sender. Point is if you're going to mask a packet, do it right. Eric wouldn't usually bother, but do it correctly if at all, to not create a false impression of security. Adventures with weird machines thirty years after "Reflections on Trusting Trust" Sergey Bratus Sergey Bratus, Dartmouth College (and Julian Bangert and Rebecca Shapiro, not present) "Reflections on Trusting Trust" refers to Ken Thompson's classic 1984 paper. "You can't trust code that you did not totally create yourself." There's invisible links in the chain-of-trust, such as "well-installed microcode bugs" or in the compiler, and other planted bugs. Thompson showed how a compiler can introduce and propagate bugs in unmodified source. But suppose if there's no bugs and you trust the author, can you trust the code? Hell No! There's too many factors—it's Babylonian in nature. Why not? Well, Input is not well-defined/recognized (code's assumptions about "checked" input will be violated (bug/vunerabiliy). For example, HTML is recursive, but Regex checking is not recursive. Input well-formed but so complex there's no telling what it does For example, ELF file parsing is complex and has multiple ways of parsing. Input is seen differently by different pieces of program or toolchain Any Input is a program input executes on input handlers (drives state changes & transitions) only a well-defined execution model can be trusted (regex/DFA, PDA, CFG) Input handler either is a "recognizer" for the inputs as a well-defined language (see langsec.org) or it's a "virtual machine" for inputs to drive into pwn-age ELF ABI (UNIX/Linux executible file format) case study. Problems can arise from these steps (without planting bugs): compiler linker loader ld.so/rtld relocator DWARF (debugger info) exceptions The problem is you can't really automatically analyze code (it's the "halting problem" and undecidable). Only solution is to freeze code and sign it. But you can't freeze everything! Can't freeze ASLR or loading—must have tables and metadata. Any sufficiently complex input data is the same as VM byte code Example, ELF relocation entries + dynamic symbols == a Turing Complete Machine (TM). @bxsays created a Turing machine in Linux from relocation data (not code) in an ELF file. For more information, see Rebecca "bx" Shapiro's presentation from last year's Toorcon, "Programming Weird Machines with ELF Metadata" @bxsays did same thing with Mach-O bytecode Or a DWARF exception handling data .eh_frame + glibc == Turning Machine X86 MMU (IDT, GDT, TSS): used address translation to create a Turning Machine. Page handler reads and writes (on page fault) memory. Uses a page table, which can be used as Turning Machine byte code. Example on Github using this TM that will fly a glider across the screen Next Sergey talked about "Parser Differentials". That having one input format, but two parsers, will create confusion and opportunity for exploitation. For example, CSRs are parsed during creation by cert requestor and again by another parser at the CA. Another example is ELF—several parsers in OS tool chain, which are all different. Can have two different Program Headers (PHDRs) because ld.so parses multiple PHDRs. The second PHDR can completely transform the executable. This is described in paper in the first issue of International Journal of PoC. Conclusions trusting computers not only about bugs! Bugs are part of a problem, but no by far all of it complex data formats means bugs no "chain of trust" in Babylon! (that is, with parser differentials) we need to squeeze complexity out of data until data stops being "code equivalent" Further information See and langsec.org. USENIX WOOT 2013 (Workshop on Offensive Technologies) for "weird machines" papers and videos.

    Read the article

1