Search Results

Search found 3920 results on 157 pages for 'dave mess'.

Page 157/157 | < Previous Page | 153 154 155 156 157

Why is Java EE 6 better than Spring ?

- by arungupta

Java EE 6 was released over 2 years ago and now there are 14 compliant application servers. In all my talks around the world, a question that is frequently asked is Why should I use Java EE 6 instead of Spring ? There are already several blogs covering that topic: Java EE wins over Spring by Bill Burke Why will I use Java EE instead of Spring in new Enterprise Java projects in 2012 ? by Kai Waehner (more discussion on TSS) Spring to Java EE migration (Part 1 and 2, 3 and 4 coming as well) by David Heffelfinger Spring to Java EE - A Migration Experience by Lincoln Baxter Migrating Spring to Java EE 6 by Bert Ertman and Paul Bakker at NLJUG Moving from Spring to Java EE 6 - The Age of Frameworks is Over at TSS Java EE vs Spring Shootout by Rohit Kelapure and Reza Rehman at JavaOne 2011 Java EE 6 and the Ewoks by Murat Yener Definite excuse to avoid Spring forever - Bert Ertman and Arun Gupta I will try to share my perspective in this blog. First of all, I'd like to start with a note: Thank you Spring framework for filling the interim gap and providing functionality that is now included in the mainstream Java EE 6 application servers. The Java EE platform has evolved over the years learning from frameworks like Spring and provides all the functionality to build an enterprise application. Thank you very much Spring framework! While Spring was revolutionary in its time and is still very popular and quite main stream in the same way Struts was circa 2003, it really is last generation's framework - some people are even calling it legacy. However my theory is "code is king". So my approach is to build/take a simple Hello World CRUD application in Java EE 6 and Spring and compare the deployable artifacts. I started looking at the official tutorial Developing a Spring Framework MVC Application Step-by-Step but it is using the older version 2.5. I wasn't able to find any updated version in the current 3.1 release. Next, I downloaded Spring Tool Suite and thought that would provide some template samples to get started. A least a quick search did not show any handy tutorials - either video or text-based. So I searched and found a link to their SVN repository at src.springframework.org/svn/spring-samples/. I tried the "mvc-basic" sample and the generated WAR file was 4.43 MB. While it was named a "basic" sample it seemed to come with 19 different libraries bundled but it was what I could find: ./WEB-INF/lib/aopalliance-1.0.jar./WEB-INF/lib/hibernate-validator-4.1.0.Final.jar./WEB-INF/lib/jcl-over-slf4j-1.6.1.jar./WEB-INF/lib/joda-time-1.6.2.jar./WEB-INF/lib/joda-time-jsptags-1.0.2.jar./WEB-INF/lib/jstl-1.2.jar./WEB-INF/lib/log4j-1.2.16.jar./WEB-INF/lib/slf4j-api-1.6.1.jar./WEB-INF/lib/slf4j-log4j12-1.6.1.jar./WEB-INF/lib/spring-aop-3.0.5.RELEASE.jar./WEB-INF/lib/spring-asm-3.0.5.RELEASE.jar./WEB-INF/lib/spring-beans-3.0.5.RELEASE.jar./WEB-INF/lib/spring-context-3.0.5.RELEASE.jar./WEB-INF/lib/spring-context-support-3.0.5.RELEASE.jar./WEB-INF/lib/spring-core-3.0.5.RELEASE.jar./WEB-INF/lib/spring-expression-3.0.5.RELEASE.jar./WEB-INF/lib/spring-web-3.0.5.RELEASE.jar./WEB-INF/lib/spring-webmvc-3.0.5.RELEASE.jar./WEB-INF/lib/validation-api-1.0.0.GA.jar And it is not even using any database! The app deployed fine on GlassFish 3.1.2 but the "@Controller Example" link did not work as it was missing the context root. With a bit of tweaking I could deploy the application and assume that the account got created because no error was displayed in the browser or server log. Next I generated the WAR for "mvc-ajax" and the 5.1 MB WAR had 20 JARs (1 removed, 2 added): ./WEB-INF/lib/aopalliance-1.0.jar./WEB-INF/lib/hibernate-validator-4.1.0.Final.jar./WEB-INF/lib/jackson-core-asl-1.6.4.jar./WEB-INF/lib/jackson-mapper-asl-1.6.4.jar./WEB-INF/lib/jcl-over-slf4j-1.6.1.jar./WEB-INF/lib/joda-time-1.6.2.jar./WEB-INF/lib/jstl-1.2.jar./WEB-INF/lib/log4j-1.2.16.jar./WEB-INF/lib/slf4j-api-1.6.1.jar./WEB-INF/lib/slf4j-log4j12-1.6.1.jar./WEB-INF/lib/spring-aop-3.0.5.RELEASE.jar./WEB-INF/lib/spring-asm-3.0.5.RELEASE.jar./WEB-INF/lib/spring-beans-3.0.5.RELEASE.jar./WEB-INF/lib/spring-context-3.0.5.RELEASE.jar./WEB-INF/lib/spring-context-support-3.0.5.RELEASE.jar./WEB-INF/lib/spring-core-3.0.5.RELEASE.jar./WEB-INF/lib/spring-expression-3.0.5.RELEASE.jar./WEB-INF/lib/spring-web-3.0.5.RELEASE.jar./WEB-INF/lib/spring-webmvc-3.0.5.RELEASE.jar./WEB-INF/lib/validation-api-1.0.0.GA.jar 2 more JARs for just doing Ajax. Anyway, deploying this application gave the following error: Caused by: java.lang.NoSuchMethodError: org.codehaus.jackson.map.SerializationConfig.<init>(Lorg/codehaus/jackson/map/ClassIntrospector;Lorg/codehaus/jackson/map/AnnotationIntrospector;Lorg/codehaus/jackson/map/introspect/VisibilityChecker;Lorg/codehaus/jackson/map/jsontype/SubtypeResolver;)V at org.springframework.samples.mvc.ajax.json.ConversionServiceAwareObjectMapper.<init>(ConversionServiceAwareObjectMapper.java:20) at org.springframework.samples.mvc.ajax.json.JacksonConversionServiceConfigurer.postProcessAfterInitialization(JacksonConversionServiceConfigurer.java:40) at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.applyBeanPostProcessorsAfterInitialization(AbstractAutowireCapableBeanFactory.java:407) Seems like some incorrect repos in the "pom.xml". Next one is "mvc-showcase" and the 6.49 MB WAR now has 28 JARs as shown below: ./WEB-INF/lib/aopalliance-1.0.jar./WEB-INF/lib/aspectjrt-1.6.10.jar./WEB-INF/lib/commons-fileupload-1.2.2.jar./WEB-INF/lib/commons-io-2.0.1.jar./WEB-INF/lib/el-api-2.2.jar./WEB-INF/lib/hibernate-validator-4.1.0.Final.jar./WEB-INF/lib/jackson-core-asl-1.8.1.jar./WEB-INF/lib/jackson-mapper-asl-1.8.1.jar./WEB-INF/lib/javax.inject-1.jar./WEB-INF/lib/jcl-over-slf4j-1.6.1.jar./WEB-INF/lib/jdom-1.0.jar./WEB-INF/lib/joda-time-1.6.2.jar./WEB-INF/lib/jstl-api-1.2.jar./WEB-INF/lib/jstl-impl-1.2.jar./WEB-INF/lib/log4j-1.2.16.jar./WEB-INF/lib/rome-1.0.0.jar./WEB-INF/lib/slf4j-api-1.6.1.jar./WEB-INF/lib/slf4j-log4j12-1.6.1.jar./WEB-INF/lib/spring-aop-3.1.0.RELEASE.jar./WEB-INF/lib/spring-asm-3.1.0.RELEASE.jar./WEB-INF/lib/spring-beans-3.1.0.RELEASE.jar./WEB-INF/lib/spring-context-3.1.0.RELEASE.jar./WEB-INF/lib/spring-context-support-3.1.0.RELEASE.jar./WEB-INF/lib/spring-core-3.1.0.RELEASE.jar./WEB-INF/lib/spring-expression-3.1.0.RELEASE.jar./WEB-INF/lib/spring-web-3.1.0.RELEASE.jar./WEB-INF/lib/spring-webmvc-3.1.0.RELEASE.jar./WEB-INF/lib/validation-api-1.0.0.GA.jar The app at least deployed and showed results this time. But still no database! Next I tried building "jpetstore" and got the error: [ERROR] Failed to execute goal on project org.springframework.samples.jpetstore:Could not resolve dependencies for project org.springframework.samples:org.springframework.samples.jpetstore:war:1.0.0-SNAPSHOT: Failed to collect dependencies for [commons-fileupload:commons-fileupload:jar:1.2.1 (compile), org.apache.struts:com.springsource.org.apache.struts:jar:1.2.9 (compile), javax.xml.rpc:com.springsource.javax.xml.rpc:jar:1.1.0 (compile), org.apache.commons:com.springsource.org.apache.commons.dbcp:jar:1.2.2.osgi (compile), commons-io:commons-io:jar:1.3.2 (compile), hsqldb:hsqldb:jar:1.8.0.7 (compile), org.apache.tiles:tiles-core:jar:2.2.0 (compile), org.apache.tiles:tiles-jsp:jar:2.2.0 (compile), org.tuckey:urlrewritefilter:jar:3.1.0 (compile), org.springframework:spring-webmvc:jar:3.0.0.BUILD-SNAPSHOT (compile), org.springframework:spring-orm:jar:3.0.0.BUILD-SNAPSHOT (compile), org.springframework:spring-context-support:jar:3.0.0.BUILD-SNAPSHOT (compile), org.springframework.webflow:spring-js:jar:2.0.7.RELEASE (compile), org.apache.ibatis:com.springsource.com.ibatis:jar:2.3.4.726 (runtime), com.caucho:com.springsource.com.caucho:jar:3.2.1 (compile), org.apache.axis:com.springsource.org.apache.axis:jar:1.4.0 (compile), javax.wsdl:com.springsource.javax.wsdl:jar:1.6.1 (compile), javax.servlet:jstl:jar:1.2 (runtime), org.aspectj:aspectjweaver:jar:1.6.5 (compile), javax.servlet:servlet-api:jar:2.5 (provided), javax.servlet.jsp:jsp-api:jar:2.1 (provided), junit:junit:jar:4.6 (test)]: Failed to read artifact descriptor for org.springframework:spring-webmvc:jar:3.0.0.BUILD-SNAPSHOT: Could not transfer artifact org.springframework:spring-webmvc:pom:3.0.0.BUILD-SNAPSHOT from/to JBoss repository (http://repository.jboss.com/maven2): Access denied to: http://repository.jboss.com/maven2/org/springframework/spring-webmvc/3.0.0.BUILD-SNAPSHOT/spring-webmvc-3.0.0.BUILD-SNAPSHOT.pom It appears the sample is broken - maybe I was pulling from the wrong repository - would be great if someone were to point me at a good target to use here. With a 50% hit on samples in this repository, I started searching through numerous blogs, most of which have either outdated information (using XML-heavy Spring 2.5), some piece of configuration (which is a typical "feature" of Spring) is missing, or too much complexity in the sample. I finally found this blog that worked like a charm. This blog creates a trivial Spring MVC 3 application using Hibernate and MySQL. This application performs CRUD operations on a single table in a database using typical Spring technologies. I downloaded the sample code from the blog, deployed it on GlassFish 3.1.2 and could CRUD the "person" entity. The source code for this application can be downloaded here. More details on the application statistics below. And then I built a similar CRUD application in Java EE 6 using NetBeans wizards in a couple of minutes. The source code for the application can be downloaded here and the WAR here. The Spring Source Tool Suite may also offer similar wizard-driven capabilities but this blog focus primarily on comparing the runtimes. The lack of STS tutorials was slightly disappointing as well. NetBeans however has tons of text-based and video tutorials and tons of material even by the community. One more bit on the download size of tools bundle ... NetBeans 7.1.1 "All" is 211 MB (which includes GlassFish and Tomcat) Spring Tool Suite 2.9.0 is 347 MB (~ 65% bigger) This blog is not about the tooling comparison so back to the Java EE 6 version of the application .... In order to run the Java EE version on GlassFish, copy the MySQL Connector/J to glassfish3/glassfish/domains/domain1/lib/ext directory and create a JDBC connection pool and JDBC resource as: ./bin/asadmin create-jdbc-connection-pool --datasourceclassname \\ com.mysql.jdbc.jdbc2.optional.MysqlDataSource --restype \\ javax.sql.DataSource --property \\ portNumber=3306:user=mysql:password=mysql:databaseName=mydatabase \\ myConnectionPool ./bin/asadmin create-jdbc-resource --connectionpoolid myConnectionPool jdbc/myDataSource I generated WARs for the two projects and the table below highlights some differences between them: Java EE 6 Spring WAR File Size 0.021030 MB 10.87 MB (~516x) Number of files 20 53 (> 2.5x) Bundled libraries 0 36 Total size of libraries 0 12.1 MB XML files 3 5 LoC in XML files 50 (11 + 15 + 24) 129 (27 + 46 + 16 + 11 + 19) (~ 2.5x) Total .properties files 1 Bundle.properties 2 spring.properties, log4j.properties Cold Deploy 5,339 ms 11,724 ms Second Deploy 481 ms 6,261 ms Third Deploy 528 ms 5,484 ms Fourth Deploy 484 ms 5,576 ms Runtime memory ~73 MB ~101 MB Some points worth highlighting from the table ... 516x WAR file, 10x deployment time - With 12.1 MB of libraries (for a very basic application) bundled in your application, the WAR file size and the deployment time will naturally go higher. The WAR file for Spring-based application is 516x bigger and the deployment time is double during the first deployment and ~ 10x during subsequent deployments. The Java EE 6 application is fully portable and will run on any Java EE 6 compliant application server. 36 libraries in the WAR - There are 14 Java EE 6 compliant application servers today. Each of those servers provide all the functionality like transactions, dependency injection, security, persistence, etc typically required of an enterprise or web application. There is no need to bundle 36 libraries worth 12.1 MB for a trivial CRUD application. These 14 compliant application servers provide all the functionality baked in. Now you can also deploy these libraries in the container but then you don't get the "portability" offered by Spring in that case. Does your typical Spring deployment actually do that ? 3x LoC in XML - The number of XML files is about 1.6x and the LoC is ~ 2.5x. So much XML seems circa 2003 when the Java language had no annotations. The XML files can be further reduced, e.g. faces-config.xml can be replaced without providing i18n, but I just want to compare stock applications. Memory usage - Both the applications were deployed on default GlassFish 3.1.2 installation and any additional memory consumed as part of deployment/access was attributed to the application. This is by no means scientific but at least provides an initial ballpark. This area definitely needs more investigation. Another table that compares typical Java EE 6 compliant application servers and the custom-stack created for a Spring application ... Java EE 6 Spring Web Container ? 53 MB (tcServer 2.6.3 Developer Edition) Security ? 12 MB (Spring Security 3.1.0) Persistence ? 6.3 MB (Hibernate 4.1.0, required) Dependency Injection ? 5.3 MB (Framework) Web Services ? 796 KB (Spring WS 2.0.4) Messaging ? 3.4 MB (RabbitMQ Server 2.7.1) 936 KB (Java client 936) OSGi ? 1.3 MB (Spring OSGi 1.2.1) GlassFish and WebLogic (starting at 33 MB) 83.3 MB There are differentiating factors on both the stacks. But most of the functionality like security, persistence, and dependency injection is baked in a Java EE 6 compliant application server but needs to be individually managed and patched for a Spring application. This very quickly leads to a "stack explosion". The Java EE 6 servers are tested extensively on a variety of platforms in different combinations whereas a Spring application developer is responsible for testing with different JDKs, Operating Systems, Versions, Patches, etc. Oracle has both the leading OSS lightweight server with GlassFish and the leading enterprise Java server with WebLogic Server, both Java EE 6 and both with lightweight deployment options. The Web Container offered as part of a Java EE 6 application server not only deploys your enterprise Java applications but also provide operational management, diagnostics, and mission-critical capabilities required by your applications. The Java EE 6 platform also introduced the Web Profile which is a subset of the specifications from the entire platform. It is targeted at developers of modern web applications offering a reasonably complete stack, composed of standard APIs, and is capable out-of-the-box of addressing the needs of a large class of Web applications. As your applications grow, the stack can grow to the full Java EE 6 platform. The GlassFish Server Web Profile starting at 33MB (smaller than just the non-standard tcServer) provides most of the functionality typically required by a web application. WebLogic provides battle-tested functionality for a high throughput, low latency, and enterprise grade web application. No individual managing or patching, all tested and commercially supported for you! Note that VMWare does have a server, tcServer, but it is non-standard and not even certified to the level of the standard Web Profile most customers expect these days. Customers who choose this risk proprietary lock-in since VMWare does not seem to want to formally certify with either Java EE 6 Enterprise Platform or with Java EE 6 Web Profile but of course it would be great if they were to join the community and help their customers reduce the risk of deploying on VMWare software. Some more points to help you decide choose between Java EE 6 and Spring ... Freedom to choose container - There are 14 Java EE 6 compliant application servers today, with a variety of open source and commercial offerings. A Java EE 6 application can be deployed on any of those containers. So if you deployed your application on GlassFish today and would like to scale up with your demands then you can deploy the same application to WebLogic. And because of the portability of a Java EE 6 application, you can even take it a different vendor altogether. Spring requires a runtime which could be any of these app servers as well. But why use Spring when all the required functionality is already baked into the application server itself ? Spring also has a different definition of portability where they claim to bundle all the libraries in the WAR file and move to any application server. But we saw earlier how bloated that archive could be. The equivalent features in Spring runtime offerings (mainly tcServer) are not all open source, not as mature, and often require manual assembly. Vendor choice - The Java EE 6 platform is created using the Java Community Process where all the big players like Oracle, IBM, RedHat, and Apache are conritbuting to make the platform successful. Each application server provides the basic Java EE 6 platform compliance and has its own competitive offerings. This allows you to choose an application server for deploying your Java EE 6 applications. If you are not happy with the support or feature of one vendor then you can move your application to a different vendor because of the portability promise offered by the platform. Spring is a set of products from a single company, one price book, one support organization, one sustaining organization, one sales organization, etc. If any of those cause a customer headache, where do you go ? Java EE, backed by multiple vendors, is a safer bet for those that are risk averse. Production support - With Spring, typically you need to get support from two vendors - VMWare and the container provider. With Java EE 6, all of this is typically provided by one vendor. For example, Oracle offers commercial support from systems, operating systems, JDK, application server, and applications on top of them. VMWare certainly offers complete production support but do you really want to put all your eggs in one basket ? Do you really use tcServer ? ;-) Maintainability - With Spring, you are likely building your own distribution with multiple JAR files, integrating, patching, versioning, etc of all those components. Spring's claim is that multiple JAR files allow you to go à la carte and pick the latest versions of different components. But who is responsible for testing whether all these versions work together ? Yep, you got it, its YOU! If something does not work, who patches and maintains the JARs ? Of course, you! Commercial support for such a configuration ? On your own! The Java EE application servers manage all of this for you and provide a well-tested and commercially supported bundle. While it is always good to realize that there is something new and improved that updates and replaces older frameworks like Spring, the good news is not only does a Java EE 6 container offer what is described here, most also will let you deploy and run your Spring applications on them while you go through an upgrade to a more modern architecture. End result, you get the best of both worlds - keeping your legacy investment but moving to a more agile, lightweight world of Java EE 6. A message to the Spring lovers ... The complexity in J2EE 1.2, 1.3, and 1.4 led to the genesis of Spring but that was in 2004. This is 2012 and the name has changed to "Java EE 6" :-) There are tons of improvements in the Java EE platform to make it easy-to-use and powerful. Some examples: Adding @Stateless on a POJO makes it an EJB EJBs can be packaged in a WAR with no special packaging or deployment descriptors "web.xml" and "faces-config.xml" are optional in most of the common cases Typesafe dependency injection is now part of the Java EE platform Add @Path on a POJO allows you to publish it as a RESTful resource EJBs can be used as backing beans for Facelets-driven JSF pages providing full MVC Java EE 6 WARs are known to be kilobytes in size and deployed in milliseconds Tons of other simplifications in the platform and application servers So if you moved away from J2EE to Spring many years ago and have not looked at Java EE 6 (which has been out since Dec 2009) then you should definitely try it out. Just be at least aware of what other alternatives are available instead of restricting yourself to one stack. Here are some workshops and screencasts worth trying: screencast #37 shows how to build an end-to-end application using NetBeans screencast #36 builds the same application using Eclipse javaee-lab-feb2012.pdf is a 3-4 hours self-paced hands-on workshop that guides you to build a comprehensive Java EE 6 application using NetBeans Each city generally has a "spring cleanup" program every year. It allows you to clean up the mess from your house. For your software projects, you don't need to wait for an annual event, just get started and reduce the technical debt now! Move away from your legacy Spring-based applications to a lighter and more modern approach of building enterprise Java applications using Java EE 6. Watch this beautiful presentation that explains how to migrate from Spring -> Java EE 6: List of files in the Java EE 6 project: ./index.xhtml./META-INF./person./person/Create.xhtml./person/Edit.xhtml./person/List.xhtml./person/View.xhtml./resources./resources/css./resources/css/jsfcrud.css./template.xhtml./WEB-INF./WEB-INF/classes./WEB-INF/classes/Bundle.properties./WEB-INF/classes/META-INF./WEB-INF/classes/META-INF/persistence.xml./WEB-INF/classes/org./WEB-INF/classes/org/javaee./WEB-INF/classes/org/javaee/javaeemysql./WEB-INF/classes/org/javaee/javaeemysql/AbstractFacade.class./WEB-INF/classes/org/javaee/javaeemysql/Person.class./WEB-INF/classes/org/javaee/javaeemysql/Person_.class./WEB-INF/classes/org/javaee/javaeemysql/PersonController$1.class./WEB-INF/classes/org/javaee/javaeemysql/PersonController$PersonControllerConverter.class./WEB-INF/classes/org/javaee/javaeemysql/PersonController.class./WEB-INF/classes/org/javaee/javaeemysql/PersonFacade.class./WEB-INF/classes/org/javaee/javaeemysql/util./WEB-INF/classes/org/javaee/javaeemysql/util/JsfUtil.class./WEB-INF/classes/org/javaee/javaeemysql/util/PaginationHelper.class./WEB-INF/faces-config.xml./WEB-INF/web.xml List of files in the Spring 3.x project: ./META-INF ./META-INF/MANIFEST.MF./WEB-INF./WEB-INF/applicationContext.xml./WEB-INF/classes./WEB-INF/classes/log4j.properties./WEB-INF/classes/org./WEB-INF/classes/org/krams ./WEB-INF/classes/org/krams/tutorial ./WEB-INF/classes/org/krams/tutorial/controller ./WEB-INF/classes/org/krams/tutorial/controller/MainController.class ./WEB-INF/classes/org/krams/tutorial/domain ./WEB-INF/classes/org/krams/tutorial/domain/Person.class ./WEB-INF/classes/org/krams/tutorial/service ./WEB-INF/classes/org/krams/tutorial/service/PersonService.class ./WEB-INF/hibernate-context.xml ./WEB-INF/hibernate.cfg.xml ./WEB-INF/jsp ./WEB-INF/jsp/addedpage.jsp ./WEB-INF/jsp/addpage.jsp ./WEB-INF/jsp/deletedpage.jsp ./WEB-INF/jsp/editedpage.jsp ./WEB-INF/jsp/editpage.jsp ./WEB-INF/jsp/personspage.jsp ./WEB-INF/lib ./WEB-INF/lib/antlr-2.7.6.jar ./WEB-INF/lib/aopalliance-1.0.jar ./WEB-INF/lib/c3p0-0.9.1.2.jar ./WEB-INF/lib/cglib-nodep-2.2.jar ./WEB-INF/lib/commons-beanutils-1.8.3.jar ./WEB-INF/lib/commons-collections-3.2.1.jar ./WEB-INF/lib/commons-digester-2.1.jar ./WEB-INF/lib/commons-logging-1.1.1.jar ./WEB-INF/lib/dom4j-1.6.1.jar ./WEB-INF/lib/ejb3-persistence-1.0.2.GA.jar ./WEB-INF/lib/hibernate-annotations-3.4.0.GA.jar ./WEB-INF/lib/hibernate-commons-annotations-3.1.0.GA.jar ./WEB-INF/lib/hibernate-core-3.3.2.GA.jar ./WEB-INF/lib/javassist-3.7.ga.jar ./WEB-INF/lib/jstl-1.1.2.jar ./WEB-INF/lib/jta-1.1.jar ./WEB-INF/lib/junit-4.8.1.jar ./WEB-INF/lib/log4j-1.2.14.jar ./WEB-INF/lib/mysql-connector-java-5.1.14.jar ./WEB-INF/lib/persistence-api-1.0.jar ./WEB-INF/lib/slf4j-api-1.6.1.jar ./WEB-INF/lib/slf4j-log4j12-1.6.1.jar ./WEB-INF/lib/spring-aop-3.0.5.RELEASE.jar ./WEB-INF/lib/spring-asm-3.0.5.RELEASE.jar ./WEB-INF/lib/spring-beans-3.0.5.RELEASE.jar ./WEB-INF/lib/spring-context-3.0.5.RELEASE.jar ./WEB-INF/lib/spring-context-support-3.0.5.RELEASE.jar ./WEB-INF/lib/spring-core-3.0.5.RELEASE.jar ./WEB-INF/lib/spring-expression-3.0.5.RELEASE.jar ./WEB-INF/lib/spring-jdbc-3.0.5.RELEASE.jar ./WEB-INF/lib/spring-orm-3.0.5.RELEASE.jar ./WEB-INF/lib/spring-tx-3.0.5.RELEASE.jar ./WEB-INF/lib/spring-web-3.0.5.RELEASE.jar ./WEB-INF/lib/spring-webmvc-3.0.5.RELEASE.jar ./WEB-INF/lib/standard-1.1.2.jar ./WEB-INF/lib/xml-apis-1.0.b2.jar ./WEB-INF/spring-servlet.xml ./WEB-INF/spring.properties ./WEB-INF/web.xml So, are you excited about Java EE 6 ? Want to get started now ? Here are some resources: Java EE 6 SDK (including runtime, samples, tutorials etc) GlassFish Server Open Source Edition 3.1.2 (Community) Oracle GlassFish Server 3.1.2 (Commercial) Java EE 6 using WebLogic 12c and NetBeans (Video) Java EE 6 with NetBeans and GlassFish (Video) Java EE with Eclipse and GlassFish (Video)

Read the article
IIS SSL Certificate Renewal Pain

- by Rick Strahl

I’m in the middle of my annual certificate renewal for the West Wind site and I can honestly say that I hate IIS’s certificate system. When it works it’s fine, but when it doesn’t man can it be a pain. Because I deal with public certificates on my site merely once a year, and you have to perform the certificate dance just the right way, I seem to run into some sort of trouble every year, thinking that Microsoft surely must have addressed the issues I ran into previously – HA! Not so. Don’t ever use the Renew Certificate Feature in IIS! The first rule that I should have never forgotten is that certificate renewals in IIS (7 is what I’m using but I think it’s no different in 7.5 and 8), simply don’t work if you’re submitting to get a public certificate from a certificate authority. I use DNSimple for my DNS domain management and SSL certificates because they provide ridiculously easy domain management and good prices for SSL certs – especially wildcard certificates, which is what I use on west-wind.com. Certificates in IIS can be found pegged to the machine root. If you go into the IIS Manager, go to the machine root the tree and then click on certificates and you then get various certificate options: Both of these options create a new Certificate request (CSR), which is just a text file. But if you’re silly enough like me to click on the Renew button on your old certificate, you’ll find that you end up generating a very long Certificate Request that looks nothing like the original certificate request and the format that’s used for this is not accepted by most certificate authorities. While I’m not sure exactly what the problem is, it simply looks like IIS is respecting none of your original certificate bit size choices and is generating a huge certificate request that is 3 times the size of a ‘normal’ certificate request. The end result is (and I’ve done this at least twice now) is that the certificate processor is likely to fail processing those renewals. Always create a new Certificate While it’s a little more work and you have to remember how to fill out the certificate request properly, this is the safe way to make sure your certificate generates properly. First comes the Distinguished Name Properties dialog: Ah yes you have to love the nomenclature of this stuff. Distinguished name, Common name – WTF is a common name? It doesn’t look common to me! Make sure this form gets filled out correctly. Common NameThis is the domain name of the Web site. In my case I’m creating a wildcard certificate so I’m using the * prefix. If you’re purchasing a certificate for a specific domain use www.west-wind.com or store.west-wind.com for example. Make sure this matches the EXACT domain you’re trying to use secure access on because that’s all the certificate is going to work on unless you get a wildcard certificate. Organization Is the name of your company or organization. Depending on the kind of certificate you purchase this name will show up on your certificate. Most low end SSL certificates (ie. those that cost under $100 for single domains) don’t list the organization, the higher signature certificates that also require extensive validation by the cert authority do. Regardless you should make sure this matches the right company/organization. Organizational Unit This can be anything. Not really sure what this is for, but traditionally I’ve always set this to Web because – well this is a Web thing after all right? I’ve never seen this used anywhere that I can tell other than to internally reference the cert. State and CountryPretty obvious. Should reflect the location of the business/organization/person or site. Next you have to configure the bit size used for the certificate: The default on this dialog is 1024, but I’ve found that most providers these days request a minimum bit length of 2048, as did my DNSimple provider. Again check with the provider when you submit to make sure. Bit length mismatches can cause problems if you use a size that isn’t supported by the provider. I had that happen last year when I submitted my CSR and it got rejected quite a bit later, when the certs usually are issued within an hour or less. When you’re done here, the certificate is saved to disk as a .txt file and it should look something like this (this is a 2048 bit length CSR):-----BEGIN NEW CERTIFICATE REQUEST----- MIIEVGCCAz0CAQAwdjELMAkGA1UEBhMCVVMxDzANBgNVBAgMBkhhd2FpaTENMAsG A1UEBwwEUGFpYTEfMB0GA1UECgwWV2VzdCBXaW5kIFRlY2hub2xvZ2llczEMMAoG B1UECwwDV2ViMRgwFgYDVQQDDA8qLndlc3Qtd2luZC5jb20wggEiMA0GCSqGSIb3 DQEBAQUAA4IBDwAwggEKAoIBAQDIPWOFMkMVRp2Ftj9w/cCVV4OYYhoZYtl+8lTk oqDwKca0xWHLgioX/9v0rZLS6a82MHqKEBxVXu+cuCmSE4AQtB/1YH9lS4tpc/be OZDvnTotP6l4MCEzzAfROcw4CiIg6X0RMSnl8IATAvv2V5LQM9TDdt9oDdMpX2IY +vVC9RZ7PMHBmR9kwI2i/lrKitzhQKaHgpmKcRlM6iqpALUiX28w5HJaDKK1MDHN 607tyFJLHijuJKx7PdTqZYf50KkC3NupfZ2avVycf18Q13jHWj59tvwEOczoVzRL l4LQivAqbhyiqMpWnrZunIOUZta5aGm+jo7O1knGWJjxuraTAgMBAAGgggGYMBoG CisGAQQBgjcNAgMxDBYKNi4yLjkyMDAuMjA0BgkrBgEEAYI3FRQxJzAlAgEFDAZS QVNYUFMMC1JBU1hQU1xSaWNrDAtJbmV0TWdyLmV4ZTByBgorBgEEAYI3DQICMWQw YgIBAR5aAE0AaQBjAHIAbwBzAG8AZgB0ACAAUgBTAEEAIABTAEMAaABhAG4AbgBl AGwAIABDAHIAeQBwAHQAbwBnAHIAYQBwAGgAaQBjACAAUAByAG8AdgBpAGQAZQBy AwEAMIHPBgkqhkiG9w0BCQ4xgcEwgb4wDgYDVR0PAQH/BAQDAgTwMBMGA1UdJQQM MAoGCCsGAQUFBwMBMHgGCSqGSIb3DQEJDwRrMGkwDgYIKoZIhvcNAwICAgCAMA4G CCqGSIb3DQMEAgIAgDALBglghkgBZQMEASowCwYJYIZIAWUDBAEtMAsGCWCGSAFl AwQBAjALBglghkgBZQMEAQUwBwYFKw4DAgcwCgYIKoZIhvcNAwcwHQYDVR0OBBYE FD/yOsTbXE+GVFCFMmldzQvyloz9MA0GCSqGSIb3DQEBBQUAA4IBAQCK6LlsCuIM 1AU0niB6QZ9v0FTsGFxP1dYvVUnJyY6VEKNiGFiQjZac7UCs0p58yScdXWEFOE8V OsjAYD3xYNc05+ckyD67UHRGEUAVB9RBvbKW23KeR/8kBmEzc8PemD52YOgExxAJ 57xWmAwEHAvbgYzQvhO8AOzH3TGvvHbg5UKM1pYgNmuwZq5DkL/IDoeIJwfk/wrI wghNTuxxIFgbH4YrgLgv4PRvrS/LaTCRBdboaCgzATMczaOb1nd/DVNR+3fCtMhM W0psTAjzRbmXF3nJyAQa7jF/52gkY0RfFX2lG5tJnG+XDsVNvKNvh9Qa5Tlmkm06 ILKCm9ciWCKk -----END NEW CERTIFICATE REQUEST----- You can take that certificate request and submit that to your certificate provider. Since this is base64 encoded you can typically just paste it into a text box on the submission page, or some providers will ask you to upload the CSR as a file. What does a Renewal look like? Note the length of the CSR will vary somewhat with key strength, but compare this to a renewal request that IIS generated from my existing site:-----BEGIN NEW CERTIFICATE REQUEST----- MIIPpwYFKoZIhvcNAQcCoIIPmDCCD5QCAQExCzAJBgUrDgMCGgUAMIIIqAYJKoZI hvcNAQcBoIIImQSCCJUwggiRMIIH+gIBADBdMSEwHwYDVQQLDBhEb21haW4gQ29u dHJvbCBWYWxpFGF0ZWQxHjAcBgNVBAsMFUVzc2VudGlhbFNTTCBXaWxkY2FyZDEY MBYGA1UEAwwPKi53ZXN0LXdpbmQuY29tMIGfMA0GCSqGSIb3DQEBAQUAA4GNADCB iQKBgQCK4OuIOR18Wb8tNMGRZiD1c9X57b332Lj7DhbckFqLs0ys8kVDHrTXSj+T Ye9nmAvfPpZmBtE5p9qRNN79rUYugAdl+qEtE4IJe1bRfxXzcKa1SXa8+TEs3zQa zYSmcR2dDuC8om1eAdeCtt0NnkvANgm1VLwGOor/UHMASaEhCQIDAQABoIIG8jAa BgorBgEEAYI3DQIDMQwWCjYuMi45MjAwLjIwNAYJKwYBBAGCNxUUMScwJQIBBQwG UkFTWFBTDAtSQVNYUFNcUmljawwLSW5ldE1nci5leGUwZgYKKwYBBAGCNw0CAjFY MFYCAQIeTgBNAGkAYwByAG8AcwBvAGYAdAAgAFMAdAByAG8AbgBnACAAQwByAHkA cAB0AG8AZwByAGEAcABoAGkAYwAgAFAAcgBvAHYAaQBkAGUAcgMBADCCAQAGCSqG SIb3DQEJDjGB8jCB7zAOBgNVHQ8BAf8EBAMCBaAwDAYDVR0TAQH/BAIwADA0BgNV HSUELTArBggrBgEFBQcDAQYIKwYBBQUHAwIGCisGAQQBgjcKAwMGCWCGSAGG+EIE ATBPBgNVHSAESDBGMDoGCysGAQQBsjEBAgIHMCswKQYIKwYBBQUHAgEWHWh0dHBz Oi8vc2VjdXJlLmNvbW9kby5jb20vQ1BTMAgGBmeBDAECATApBgNVHREEIjAggg8q Lndlc3Qtd2luZC5jb22CDXdlc3Qtd2luZC5jb20wHQYDVR0OBBYEFEVLAyO8gDiv lsfovKrx9mHPyrsiMIIFMAYJKwYBBAGCNw0BMYIFITCCBR0wggQFoAMCAQICEQDu 1E1T5Jvtkm5LOfSHabWlMA0GCSqGSIb3DQEBBQUAMHIxCzAJBgNVBAYTAkdCMRsw GQYDVQQIExJHcmVhdGVyIE1hbmNoZXN0ZXIxEDAOBgNVBAcTB1NhbGZvcmQxGjAY BgNVBAoTEUNPTU9ETyBDQSBMaW1pdGVkMRgwFgYDVQQDEw9Fc3NlbnRpYWxTU0wg Q0EwHhcNMTQwNTA3MDAwMDAwWhcNMTUwNjA2MjM1OTU5WjBdMSEwHwYDVQQLExhE b21haW4gQ29udHJvbCBWYWxpZGF0ZWQxHjAcBgNVBAsTFUVzc2VudGlhbFNTTCBX aWxkY2FyZDEYMBYGA1UEAxQPKi53ZXN0LXdpbmQuY29tMIIBIjANBgkqhkiG9w0B AQEFAAOCAQ8AMIIBCgKCAQEAiyKfL66XB51DlUfm6xXqJBcvMU2qorRHxC+WjEpB amvg8XoqNfCKzDAvLMbY4BLhbYCTagqtslnP3Gj4AKhXqRKU0n6iSbmS1gcWzCJM CHufZ5RDtuTuxhTdJxzP9YqZUfKV5abWQp/TK6V1ryaBJvdqM73q4tRjrQODtkiR PfZjxpybnBHFJS8jYAf8jcOjSDZcgN1d9Evc5MrEJCp/90cAkozyF/NMcFtD6Yj8 UM97z3MzDT2JPDoH3kAr3cCgpUNyQ2+wDNCnL9eWYFkOQi8FZMsZol7KlZ5NgNfO a7iZMVGbqDg6rkS//2uGe6tSQJTTs+mAZB+na+M8XT2UqwIDAQABo4IBwTCCAb0w HwYDVR0jBBgwFoAU2svqrVsIXcz//CZUzknlVcY49PgwHQYDVR0OBBYEFH0AmLiL RSEL9+sQD/n5O4N7/nnqMA4GA1UdDwEB/wQEAwIFoDAMBgNVHRMBAf8EAjAAMDQG A1UdJQQtMCsGCCsGAQUFBwMBBggrBgEFBQcDAgYKKwYBBAGCNwoDAwYJYIZIAYb4 QgQBME8GA1UdIARIMEYwOgYLKwYBBAGyMQECAgcwKzApBggrBgEFBQcCARYdaHR0 cHM6Ly9zZWN1cmUuY29tb2RvLmNvbS9DUFMwCAYGZ4EMAQIBMDsGA1UdHwQ0MDIw MKAuoCyGKmh0dHA6Ly9jcmwuY29tb2RvY2EuY29tL0Vzc2VudGlhbFNTTENBLmNy bDBuBggrBgEFBQcBAQRiMGAwOAYIKwYBBQUHMAKGLGh0dHA6Ly9jcnQuY29tb2Rv Y2EuY29tL0Vzc2VudGlhbFNTTENBXzIuY3J0MCQGCCsGAQUFBzABhhhodHRwOi8v b2NzcC5jb21vZG9jYS5jb20wKQYDVR0RBCIwIIIPKi53ZXN0LXdpbmQuY29tgg13 ZXN0LXdpbmQuY29tMA0GCSqGSIb3DQEBBQUAA4IBAQBqBfd6QHrxXsfgfKARG6np 8yszIPhHGPPmaE7xq7RpcZjY9H+8l6fe4jQbGFjbA5uHBklYI4m2snhPaW2p8iF8 YOkm2V2hEsSTnkf5/flw9mZtlCFEDFXSsBxBdNz8RYTthPMu1h09C0XuDB30sztg nR692FrxJN5/bXsk+MC9nEweTFW/t2HW+XZ8bhM7vsAS+pZionR4MyuQ0mYIt/lD csZVZ91KxTsIm8rNMkkYGFoSIXjQ0+0tCbxMF0i2qnpmNRpA6PU8l7lxxvPkplsk 9KB8QIPFrR5p/i/SUAd9vECWh5+/ktlcrfFP2PK7XcEwWizsvMrNqLyvQVNXSUPT MA0GCSqGSIb3DQEBBQUAA4GBABt/NitwMzc5t22p5+zy4HXbVYzLEjesLH8/v0ot uLQ3kkG8tIWNh5RplxIxtilXt09H4Oxpo3fKUN0yw+E6WsBfg0sAF8pHNBdOJi48 azrQbt4HvKktQkGpgYFjLsormjF44SRtToLHlYycDHBNvjaBClUwMCq8HnwY6vDq xikRoIIFITCCBR0wggQFoAMCAQICEQDu1E1T5Jvtkm5LOfSHabWlMA0GCSqGSIb3 DQEBBQUAMHIxCzAJBgNVBAYTAkdCMRswGQYDVQQIExJHcmVhdGVyIE1hbmNoZXN0 ZXIxEDAOBgNVBAcTB1NhbGZvcmQxGjAYBgNVBAoTEUNPTU9ETyBDQSBMaW1pdGVk MRgwFgYDVQQDEw9Fc3NlbnRpYWxTU0wgQ0EwHhcNMTQwNTA3MDAwMDAwWhcNMTUw NjA2MjM1OTU5WjBdMSEwHwYDVQQLExhEb21haW4gQ29udHJvbCBWYWxpZGF0ZWQx HjAcBgNVBAsTFUVzc2VudGlhbFNTTCBXaWxkY2FyZDEYMBYGA1UEAxQPKi53ZXN0 LXdpbmQuY29tMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAiyKfL66X B51DlUfm6xXqJBcvMU2qorRHxC+WjEpBamvg8XoqNfCKzDAvLMbY4BLhbYCTagqt slnP3Gj4AKhXqRKU0n6iSbmS1gcWzCJMCHufZ5RDtuTuxhTdJxzP9YqZUfKV5abW Qp/TK6V1ryaBJvdqM73q4tRjrQODtkiRPfZjxpybnBHFJS8jYAf8jcOjSDZcgN1d 9Evc5MrEJCp/90cAkozyF/NMcFtD6Yj8UM97z3MzDT2JPDoH3kAr3cCgpUNyQ2+w DNCnL9eWYFkOQi8FZMsZol7KlZ5NgNfOa7iZMVGbqDg6rkS//2uGe6tSQJTTs+mA ZB+na+M8XT2UqwIDAQABo4IBwTCCAb0wHwYDVR0jBBgwFoAU2svqrVsIXcz//CZU zknlVcY49PgwHQYDVR0OBBYEFH0AmLiLRSEL9+sQD/n5O4N7/nnqMA4GA1UdDwEB /wQEAwIFoDAMBgNVHRMBAf8EAjAAMDQGA1UdJQQtMCsGCCsGAQUFBwMBBggrBgEF BQcDAgYKKwYBBAGCNwoDAwYJYIZIAYb4QgQBME8GA1UdIARIMEYwOgYLKwYBBAGy MQECAgcwKzApBggrBgEFBQcCARYdaHR0cHM6Ly9zZWN1cmUuY29tb2RvLmNvbS9D UFMwCAYGZ4EMAQIBMDsGA1UdHwQ0MDIwMKAuoCyGKmh0dHA6Ly9jcmwuY29tb2Rv Y2EuY29tL0Vzc2VudGlhbFNTTENBLmNybDBuBggrBgEFBQcBAQRiMGAwOAYIKwYB BQUHMAKGLGh0dHA6Ly9jcnQuY29tb2RvY2EuY29tL0Vzc2VudGlhbFNTTENBXzIu Y3J0MCQGCCsGAQUFBzABhhhodHRwOi8vb2NzcC5jb21vZG9jYS5jb20wKQYDVR0R BCIwIIIPKi53ZXN0LXdpbmQuY29tgg13ZXN0LXdpbmQuY29tMA0GCSqGSIb3DQEB BQUAA4IBAQBqBfd6QHrxXsfgfKARG6np8yszIPhHGPPmaE7xq7RpcZjY9H+8l6fe 4jQbGFjbA5uHBklYI4m2snhPaW2p8iF8YOkm2V2hEsSTnkf5/flw9mZtlCFEDFXS sBxBdNz8RYTthPMu1h09C0XuDB30sztgnR692FrxJN5/bXsk+MC9nEweTFW/t2HW +XZ8bhM7vsAS+pZionR4MyuQ0mYIt/lDcsZVZ91KxTsIm8rNMkkYGFoSIXjQ0+0t CbxMF0i2qnpmNRpA6PU8l7lxxvPkplsk9KB8QIPFrR5p/i/SUAd9vECWh5+/ktlc rfFP2PK7XcEwWizsvMrNqLyvQVNXSUPTMYIBrzCCAasCAQEwgYcwcjELMAkGA1UE BhMCR0IxGzAZBgNVBAgTEkdyZWF0ZXIgTWFuY2hlc3RlcjEQMA4GA1UEBxMHU2Fs Zm9yZDEaMBgGA1UEChMRQ09NT0RPIENBIExpbWl0ZWQxGDAWBgNVBAMTD0Vzc2Vu dGlhbFNTTCBDQQIRAO7UTVPkm+2Sbks59IdptaUwCQYFKw4DAhoFADANBgkqhkiG 9w0BAQEFAASCAQB8PNQ6bYnQpWfkHyxnDuvNKw3wrqF2p7JMZm+SuN2qp3R2LpCR mW2LrGtQIm9Iob/QOYH+8houYNVdvsATGPXX2T8gzn+anof4tOG0vCTK1Bp9bwf9 MkRP+1c8RW/vkYmUW4X5/C+y3CZpMH5dDTaXBIpXFzjX/fxNpH/rvLzGiaYYL3Cn OLO+aOADr9qq5yoqwpiYCSfYNNYKTUNNGfYIidQwYtbHXEYhSukB2oR89xD2sZZ4 bOqFjUPgTa5SsERLDDeg3omMKiIXVYGxlqBEq51Kge6IQt4qQV9P9VgInW7cWmKe dTqNHI9ri3ttewdEnT++TKGKKfTjX9SR8Waj -----END NEW CERTIFICATE REQUEST----- Clearly there’s something very different between this an my original request! And it didn’t work. IIS creates a custom CSR that is encoded in a format that no certificate authority I’ve ever used uses. If you want the gory details of what’s in there look at this ServerFault question (thanks to Mika in the comments). In the end it doesn’t matter though – no certificate authority knows what to do with this CSR. So create a new CSR and skip the renewal. Always! Use the same Server Keep in mind that on IIS at least you should always create your certificate on a single server and then when you receive the final certificate from your provider import it on that server. IIS tracks the CSR it created and requires it in order to import the final certificate properly. So if for some reason you try to install the certificate on another server, it won’t work. I’ve also run into trouble trying to install the same certificate twice – this time around I didn’t give my certificate the proper friendly name and IIS failed to allow me to assign the certificate to any of my Web sites. So I removed the certificate and tried to import again, only to find it failed the second time around. There are other ways to fix this, but in my case I had to have the certificate re-issued to work – not what you want to do. Regardless of what you do though, when you import make sure you do it right the first time by crossing all your t’s and dotting your i's– it’ll save you a lot of grief! You don’t actually have to use the server that the certificate gets installed on to generate the CSR and first install it, but it is generally a good idea to do so just so you can get the certificate installed into the right place right away. If you have access to the server where you need to install the certificate you might as well use it. But you can use another machine to generated the and install the certificate, then export the certificate and move it to another machine as needed. So you can use your Dev machine to create a certificate then export it and install it on a live server. More on installation and back up/export later. Installing the Certificate Once you’ve submitted a CSR request your provider will process the request and eventually issue you a new final certificate that contains another text file with the final key to import into your certificate store. IIS does this by combining the content in your certificate request with the original CSR. If all goes well your new certificate shows up in the certificate list and you’re ready to assign the certificate to your sites. Make sure you use a friendly name that matches domain name of your site. So use *.mysite.com or www.mysite.com or store.mysite.com to ensure IIS recognizes the certificate. I made the mistake of not naming my friendly name this way and found that IIS was unable to link my sites to my wildcard certificate. It needed to have the *. as part of the certificate otherwise the Hostname input field was blanked out. Changing the Friendly Name If you by accidentally used an invalid friendly name you can change it later in the Windows certificate store. Bring up a Run Box Type MMC File | Add/Remove Snap In Add Certificates | Computer Account | Local Computer Drill into Certificates | Personal | Certificates Find your Certificate | Right Click | Properties Edit the Friendly Name | Click OK Backing up your Certificate The first thing you should do once your certificate is successfully installed is to back it up! In case your server crashes or you otherwise lose your configuration this will ensure you have an easy way to recover and reinstall your certificate either on the same server or a different one. If you’re running a server farm or using a wildcard certificate you also need to get the certificate onto other machines and a PFX file import is the easiest way to do this. To back up your certificate select your certificate and choose Export from the context or sidebar menu: The Export Certificate option allows you to export a password protected binary file that you can import in a single step. You can copy the resulting binary PFX file to back up or copy to other machines to install on. Importing the certificate on another machine is as easy as pointing at the PFX file and specifying the password. IIS handles the rest. Assigning a new certificate to your Site Once you have the new certificate installed, all that’s left to do is assign it to your site. In IIS select your Web site and bring up the Site Bindings from the right sidebar. Add a new binding for https, bind it to port 443, specify your hostname and pick the certificate from the pick list. If you’re using a root site make sure to set up your certificate for www.yoursite.com and also for yoursite.com so that both work properly with SSL. Note that you need to explicitly configure each hostname for a certificate if you plan to use SSL. Luckily if you update your SSL certificate in the following year, IIS prompts you and asks whether you like to update all other sites that are using the existing cert to the newer cert. And you’re done. So what’s the Pain? So, all of this is old hat and it doesn’t look all that bad right? So what’s the pain here? Well if you follow the instructions and do everything right, then the process is about as straight forward as you would expect it to be. You create a cert request, you import it and assign it to your sites. That’s the basic steps and to be perfectly fair it works well – if nothing goes wrong. However, renewing tends to be the problem. The first unintuitive issue is that you simply shouldn’t renew but create a new CSR and generate your new certificate from that. Over the years I’ve fallen prey to the belief that Microsoft eventually will fix this so that the renewal creates the same type of CSR as the old cert, but apparently that will just never happen. Booo! The other problem I ran into is that I accidentally misnamed my imported certificate which in turn set off a chain of events that caused my originally issued certificate to become uninstallable. When I received my completed certificate I installed it and it installed just fine, but the friendly name was wrong. As a result IIS refused to assign the certificate to any of my host headered sites. That’s strike number one. Why the heck should the friendly name have any effect on the ability to attach the certificate??? Next I uninstalled the certificate because I figured that would be the easiest way to make sure I get it right. But I found that I could not reinstall my certificate. I kept getting these stop errors: "ASN1 bad tag value met" that would prevent the installation from completion. After searching around for this error and reading countless long messages on forums, I found that this error supposedly does not actually mean the install failed, but the list wouldn’t refresh. Commodo has this to say: Note: There is a known issue in IIS 7 giving the following error: "Cannot find the certificate request associated with this certificate file. A certificate request must be completed on the computer where it was created." You may also receive a message stating "ASN1 bad tag value met". If this is the same server that you generated the CSR on then, in most cases, the certificate is actually installed. Simply cancel the dialog and press "F5" to refresh the list of server certificates. If the new certificate is now in the list, you can continue with the next step. If it is not in the list, you will need to reissue your certificate using a new CSR (see our CSR creation instructions for IIS 7). After creating a new CSR, login to your Comodo account and click the 'replace' button for your certificate. Not sure if this issue is fixed in IIS 8 but that’s an insane bug to have crop up. As it turns out, in my case the refresh didn’t work and the certificate didn’t show up in the IIS list after the reinstall. In fact when looking at the certificate store I could see my certificate was installed in the right place, but the private key is missing which is most likely why IIS is not picking it up. It looks like IIS could not match the final cert to the original CSR generated. But again some sort of message to that affect might be helpful instead of ASN1 bad tag value met. Recovering the Private Key So it turns out my original problem was that I received the published key, but when I imported the private key was missing. There’s a relatively easy way to recover from this. If your certificate doesn’t show up in IIS check in the certificate store for the local machine (see steps above on how to bring this up). If you look at the certificate in Certificates/Personal/Certificates make sure you see the key as shown in the image below: if the key is missing it means that the certificate is missing the private key most likely. To fix a certificate you can do the following: Double click the certificate Go to the Details Tab Copy down the Serial number You can copy the serial number from the area blurred out above. The serial number will be in a format like ?00 a7 9b a1 a4 9d 91 63 57 d6 9f 26 b8 ee 79 b5 cb and you’ll need to strip out the spaces in order to use it in the next step. Next open up an Administrative command prompt and issue the following command: certutil -repairstore my 00a79ba1a49d916357d69f26b8ee79b5cb You should get a confirmation message that the repair worked. If you now go back to the certificate store you should now see the key icon show up on the certificate. Your certificate is fixed. Now go back into IIS Manager and refresh the list of certificates and if all goes well you should see all the certificates that showed in the cert store now: Remember – back up the key first then map to your site… Summary I deal with a lot of customers who run their own IIS servers, and I can’t tell you how often I hear about botched SSL installations. When I posted some of my issues on Twitter yesterday I got a hell storm of “me too” responses. I’m clearly not the only one, who’s run into this especially with renewals. I feel pretty comfortable with IIS configuration and I do a lot of it for support purposes, but the SSL configuration is one that never seems to go seamlessly. This blog post is meant as reminder to myself to read next time I do a renewal. So I can dot my i's and dash my t’s before I get caught in the mess I’m dealing with today. Hopefully some of you find this useful as well.© Rick Strahl, West Wind Technologies, 2005-2014Posted in IIS7 Security Tweet !function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src="//platform.twitter.com/widgets.js";fjs.parentNode.insertBefore(js,fjs);}}(document,"script","twitter-wjs"); (function() { var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true; po.src = 'https://apis.google.com/js/plusone.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s); })();

Read the article
Improving Partitioned Table Join Performance

- by Paul White

The query optimizer does not always choose an optimal strategy when joining partitioned tables. This post looks at an example, showing how a manual rewrite of the query can almost double performance, while reducing the memory grant to almost nothing. Test Data The two tables in this example use a common partitioning partition scheme. The partition function uses 41 equal-size partitions: CREATE PARTITION FUNCTION PFT (integer) AS RANGE RIGHT FOR VALUES ( 125000, 250000, 375000, 500000, 625000, 750000, 875000, 1000000, 1125000, 1250000, 1375000, 1500000, 1625000, 1750000, 1875000, 2000000, 2125000, 2250000, 2375000, 2500000, 2625000, 2750000, 2875000, 3000000, 3125000, 3250000, 3375000, 3500000, 3625000, 3750000, 3875000, 4000000, 4125000, 4250000, 4375000, 4500000, 4625000, 4750000, 4875000, 5000000 ); GO CREATE PARTITION SCHEME PST AS PARTITION PFT ALL TO ([PRIMARY]); There two tables are: CREATE TABLE dbo.T1 ( TID integer NOT NULL IDENTITY(0,1), Column1 integer NOT NULL, Padding binary(100) NOT NULL DEFAULT 0x, CONSTRAINT PK_T1 PRIMARY KEY CLUSTERED (TID) ON PST (TID) ); CREATE TABLE dbo.T2 ( TID integer NOT NULL, Column1 integer NOT NULL, Padding binary(100) NOT NULL DEFAULT 0x, CONSTRAINT PK_T2 PRIMARY KEY CLUSTERED (TID, Column1) ON PST (TID) ); The next script loads 5 million rows into T1 with a pseudo-random value between 1 and 5 for Column1. The table is partitioned on the IDENTITY column TID: INSERT dbo.T1 WITH (TABLOCKX) (Column1) SELECT (ABS(CHECKSUM(NEWID())) % 5) + 1 FROM dbo.Numbers AS N WHERE n BETWEEN 1 AND 5000000; In case you don’t already have an auxiliary table of numbers lying around, here’s a script to create one with 10 million rows: CREATE TABLE dbo.Numbers (n bigint PRIMARY KEY); WITH L0 AS(SELECT 1 AS c UNION ALL SELECT 1), L1 AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B), L2 AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B), L3 AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B), L4 AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B), L5 AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B), Nums AS(SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n FROM L5) INSERT dbo.Numbers WITH (TABLOCKX) SELECT TOP (10000000) n FROM Nums ORDER BY n OPTION (MAXDOP 1); Table T1 contains data like this: Next we load data into table T2. The relationship between the two tables is that table 2 contains ‘n’ rows for each row in table 1, where ‘n’ is determined by the value in Column1 of table T1. There is nothing particularly special about the data or distribution, by the way. INSERT dbo.T2 WITH (TABLOCKX) (TID, Column1) SELECT T.TID, N.n FROM dbo.T1 AS T JOIN dbo.Numbers AS N ON N.n >= 1 AND N.n <= T.Column1; Table T2 ends up containing about 15 million rows: The primary key for table T2 is a combination of TID and Column1. The data is partitioned according to the value in column TID alone. Partition Distribution The following query shows the number of rows in each partition of table T1: SELECT PartitionID = CA1.P, NumRows = COUNT_BIG(*) FROM dbo.T1 AS T CROSS APPLY (VALUES ($PARTITION.PFT(TID))) AS CA1 (P) GROUP BY CA1.P ORDER BY CA1.P; There are 40 partitions containing 125,000 rows (40 * 125k = 5m rows). The rightmost partition remains empty. The next query shows the distribution for table 2: SELECT PartitionID = CA1.P, NumRows = COUNT_BIG(*) FROM dbo.T2 AS T CROSS APPLY (VALUES ($PARTITION.PFT(TID))) AS CA1 (P) GROUP BY CA1.P ORDER BY CA1.P; There are roughly 375,000 rows in each partition (the rightmost partition is also empty): Ok, that’s the test data done. Test Query and Execution Plan The task is to count the rows resulting from joining tables 1 and 2 on the TID column: SET STATISTICS IO ON; DECLARE @s datetime2 = SYSUTCDATETIME(); SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID; SELECT DATEDIFF(Millisecond, @s, SYSUTCDATETIME()); SET STATISTICS IO OFF; The optimizer chooses a plan using parallel hash join, and partial aggregation: The Plan Explorer plan tree view shows accurate cardinality estimates and an even distribution of rows across threads (click to enlarge the image): With a warm data cache, the STATISTICS IO output shows that no physical I/O was needed, and all 41 partitions were touched: Running the query without actual execution plan or STATISTICS IO information for maximum performance, the query returns in around 2600ms. Execution Plan Analysis The first step toward improving on the execution plan produced by the query optimizer is to understand how it works, at least in outline. The two parallel Clustered Index Scans use multiple threads to read rows from tables T1 and T2. Parallel scan uses a demand-based scheme where threads are given page(s) to scan from the table as needed. This arrangement has certain important advantages, but does result in an unpredictable distribution of rows amongst threads. The point is that multiple threads cooperate to scan the whole table, but it is impossible to predict which rows end up on which threads. For correct results from the parallel hash join, the execution plan has to ensure that rows from T1 and T2 that might join are processed on the same thread. For example, if a row from T1 with join key value ‘1234’ is placed in thread 5’s hash table, the execution plan must guarantee that any rows from T2 that also have join key value ‘1234’ probe thread 5’s hash table for matches. The way this guarantee is enforced in this parallel hash join plan is by repartitioning rows to threads after each parallel scan. The two repartitioning exchanges route rows to threads using a hash function over the hash join keys. The two repartitioning exchanges use the same hash function so rows from T1 and T2 with the same join key must end up on the same hash join thread. Expensive Exchanges This business of repartitioning rows between threads can be very expensive, especially if a large number of rows is involved. The execution plan selected by the optimizer moves 5 million rows through one repartitioning exchange and around 15 million across the other. As a first step toward removing these exchanges, consider the execution plan selected by the optimizer if we join just one partition from each table, disallowing parallelism: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = 1 AND $PARTITION.PFT(T2.TID) = 1 OPTION (MAXDOP 1); The optimizer has chosen a (one-to-many) merge join instead of a hash join. The single-partition query completes in around 100ms. If everything scaled linearly, we would expect that extending this strategy to all 40 populated partitions would result in an execution time around 4000ms. Using parallelism could reduce that further, perhaps to be competitive with the parallel hash join chosen by the optimizer. This raises a question. If the most efficient way to join one partition from each of the tables is to use a merge join, why does the optimizer not choose a merge join for the full query? Forcing a Merge Join Let’s force the optimizer to use a merge join on the test query using a hint: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (MERGE JOIN); This is the execution plan selected by the optimizer: This plan results in the same number of logical reads reported previously, but instead of 2600ms the query takes 5000ms. The natural explanation for this drop in performance is that the merge join plan is only using a single thread, whereas the parallel hash join plan could use multiple threads. Parallel Merge Join We can get a parallel merge join plan using the same query hint as before, and adding trace flag 8649: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (MERGE JOIN, QUERYTRACEON 8649); The execution plan is: This looks promising. It uses a similar strategy to distribute work across threads as seen for the parallel hash join. In practice though, performance is disappointing. On a typical run, the parallel merge plan runs for around 8400ms; slower than the single-threaded merge join plan (5000ms) and much worse than the 2600ms for the parallel hash join. We seem to be going backwards! The logical reads for the parallel merge are still exactly the same as before, with no physical IOs. The cardinality estimates and thread distribution are also still very good (click to enlarge): A big clue to the reason for the poor performance is shown in the wait statistics (captured by Plan Explorer Pro): CXPACKET waits require careful interpretation, and are most often benign, but in this case excessive waiting occurs at the repartitioning exchanges. Unlike the parallel hash join, the repartitioning exchanges in this plan are order-preserving ‘merging’ exchanges (because merge join requires ordered inputs): Parallelism works best when threads can just grab any available unit of work and get on with processing it. Preserving order introduces inter-thread dependencies that can easily lead to significant waits occurring. In extreme cases, these dependencies can result in an intra-query deadlock, though the details of that will have to wait for another time to explore in detail. The potential for waits and deadlocks leads the query optimizer to cost parallel merge join relatively highly, especially as the degree of parallelism (DOP) increases. This high costing resulted in the optimizer choosing a serial merge join rather than parallel in this case. The test results certainly confirm its reasoning. Collocated Joins In SQL Server 2008 and later, the optimizer has another available strategy when joining tables that share a common partition scheme. This strategy is a collocated join, also known as as a per-partition join. It can be applied in both serial and parallel execution plans, though it is limited to 2-way joins in the current optimizer. Whether the optimizer chooses a collocated join or not depends on cost estimation. The primary benefits of a collocated join are that it eliminates an exchange and requires less memory, as we will see next. Costing and Plan Selection The query optimizer did consider a collocated join for our original query, but it was rejected on cost grounds. The parallel hash join with repartitioning exchanges appeared to be a cheaper option. There is no query hint to force a collocated join, so we have to mess with the costing framework to produce one for our test query. Pretending that IOs cost 50 times more than usual is enough to convince the optimizer to use collocated join with our test query: -- Pretend IOs are 50x cost temporarily DBCC SETIOWEIGHT(50); -- Co-located hash join SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (RECOMPILE); -- Reset IO costing DBCC SETIOWEIGHT(1); Collocated Join Plan The estimated execution plan for the collocated join is: The Constant Scan contains one row for each partition of the shared partitioning scheme, from 1 to 41. The hash repartitioning exchanges seen previously are replaced by a single Distribute Streams exchange using Demand partitioning. Demand partitioning means that the next partition id is given to the next parallel thread that asks for one. My test machine has eight logical processors, and all are available for SQL Server to use. As a result, there are eight threads in the single parallel branch in this plan, each processing one partition from each table at a time. Once a thread finishes processing a partition, it grabs a new partition number from the Distribute Streams exchange…and so on until all partitions have been processed. It is important to understand that the parallel scans in this plan are different from the parallel hash join plan. Although the scans have the same parallelism icon, tables T1 and T2 are not being co-operatively scanned by multiple threads in the same way. Each thread reads a single partition of T1 and performs a hash match join with the same partition from table T2. The properties of the two Clustered Index Scans show a Seek Predicate (unusual for a scan!) limiting the rows to a single partition: The crucial point is that the join between T1 and T2 is on TID, and TID is the partitioning column for both tables. A thread that processes partition ‘n’ is guaranteed to see all rows that can possibly join on TID for that partition. In addition, no other thread will see rows from that partition, so this removes the need for repartitioning exchanges. CPU and Memory Efficiency Improvements The collocated join has removed two expensive repartitioning exchanges and added a single exchange processing 41 rows (one for each partition id). Remember, the parallel hash join plan exchanges had to process 5 million and 15 million rows. The amount of processor time spent on exchanges will be much lower in the collocated join plan. In addition, the collocated join plan has a maximum of 8 threads processing single partitions at any one time. The 41 partitions will all be processed eventually, but a new partition is not started until a thread asks for it. Threads can reuse hash table memory for the new partition. The parallel hash join plan also had 8 hash tables, but with all 5,000,000 build rows loaded at the same time. The collocated plan needs memory for only 8 * 125,000 = 1,000,000 rows at any one time. Collocated Hash Join Performance The collated join plan has disappointing performance in this case. The query runs for around 25,300ms despite the same IO statistics as usual. This is much the worst result so far, so what went wrong? It turns out that cardinality estimation for the single partition scans of table T1 is slightly low. The properties of the Clustered Index Scan of T1 (graphic immediately above) show the estimation was for 121,951 rows. This is a small shortfall compared with the 125,000 rows actually encountered, but it was enough to cause the hash join to spill to physical tempdb: A level 1 spill doesn’t sound too bad, until you realize that the spill to tempdb probably occurs for each of the 41 partitions. As a side note, the cardinality estimation error is a little surprising because the system tables accurately show there are 125,000 rows in every partition of T1. Unfortunately, the optimizer uses regular column and index statistics to derive cardinality estimates here rather than system table information (e.g. sys.partitions). Collocated Merge Join We will never know how well the collocated parallel hash join plan might have worked without the cardinality estimation error (and the resulting 41 spills to tempdb) but we do know: Merge join does not require a memory grant; and Merge join was the optimizer’s preferred join option for a single partition join Putting this all together, what we would really like to see is the same collocated join strategy, but using merge join instead of hash join. Unfortunately, the current query optimizer cannot produce a collocated merge join; it only knows how to do collocated hash join. So where does this leave us? CROSS APPLY sys.partitions We can try to write our own collocated join query. We can use sys.partitions to find the partition numbers, and CROSS APPLY to get a count per partition, with a final step to sum the partial counts. The following query implements this idea: SELECT row_count = SUM(Subtotals.cnt) FROM ( -- Partition numbers SELECT p.partition_number FROM sys.partitions AS p WHERE p.[object_id] = OBJECT_ID(N'T1', N'U') AND p.index_id = 1 ) AS P CROSS APPLY ( -- Count per collocated join SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals; The estimated plan is: The cardinality estimates aren’t all that good here, especially the estimate for the scan of the system table underlying the sys.partitions view. Nevertheless, the plan shape is heading toward where we would like to be. Each partition number from the system table results in a per-partition scan of T1 and T2, a one-to-many Merge Join, and a Stream Aggregate to compute the partial counts. The final Stream Aggregate just sums the partial counts. Execution time for this query is around 3,500ms, with the same IO statistics as always. This compares favourably with 5,000ms for the serial plan produced by the optimizer with the OPTION (MERGE JOIN) hint. This is another case of the sum of the parts being less than the whole – summing 41 partial counts from 41 single-partition merge joins is faster than a single merge join and count over all partitions. Even so, this single-threaded collocated merge join is not as quick as the original parallel hash join plan, which executed in 2,600ms. On the positive side, our collocated merge join uses only one logical processor and requires no memory grant. The parallel hash join plan used 16 threads and reserved 569 MB of memory: Using a Temporary Table Our collocated merge join plan should benefit from parallelism. The reason parallelism is not being used is that the query references a system table. We can work around that by writing the partition numbers to a temporary table (or table variable): SET STATISTICS IO ON; DECLARE @s datetime2 = SYSUTCDATETIME(); CREATE TABLE #P ( partition_number integer PRIMARY KEY); INSERT #P (partition_number) SELECT p.partition_number FROM sys.partitions AS p WHERE p.[object_id] = OBJECT_ID(N'T1', N'U') AND p.index_id = 1; SELECT row_count = SUM(Subtotals.cnt) FROM #P AS p CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals; DROP TABLE #P; SELECT DATEDIFF(Millisecond, @s, SYSUTCDATETIME()); SET STATISTICS IO OFF; Using the temporary table adds a few logical reads, but the overall execution time is still around 3500ms, indistinguishable from the same query without the temporary table. The problem is that the query optimizer still doesn’t choose a parallel plan for this query, though the removal of the system table reference means that it could if it chose to: In fact the optimizer did enter the parallel plan phase of query optimization (running search 1 for a second time): Unfortunately, the parallel plan found seemed to be more expensive than the serial plan. This is a crazy result, caused by the optimizer’s cost model not reducing operator CPU costs on the inner side of a nested loops join. Don’t get me started on that, we’ll be here all night. In this plan, everything expensive happens on the inner side of a nested loops join. Without a CPU cost reduction to compensate for the added cost of exchange operators, candidate parallel plans always look more expensive to the optimizer than the equivalent serial plan. Parallel Collocated Merge Join We can produce the desired parallel plan using trace flag 8649 again: SELECT row_count = SUM(Subtotals.cnt) FROM #P AS p CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals OPTION (QUERYTRACEON 8649); The actual execution plan is: One difference between this plan and the collocated hash join plan is that a Repartition Streams exchange operator is used instead of Distribute Streams. The effect is similar, though not quite identical. The Repartition uses round-robin partitioning, meaning the next partition id is pushed to the next thread in sequence. The Distribute Streams exchange seen earlier used Demand partitioning, meaning the next partition id is pulled across the exchange by the next thread that is ready for more work. There are subtle performance implications for each partitioning option, but going into that would again take us too far off the main point of this post. Performance The important thing is the performance of this parallel collocated merge join – just 1350ms on a typical run. The list below shows all the alternatives from this post (all timings include creation, population, and deletion of the temporary table where appropriate) from quickest to slowest: Collocated parallel merge join: 1350ms Parallel hash join: 2600ms Collocated serial merge join: 3500ms Serial merge join: 5000ms Parallel merge join: 8400ms Collated parallel hash join: 25,300ms (hash spill per partition) The parallel collocated merge join requires no memory grant (aside from a paltry 1.2MB used for exchange buffers). This plan uses 16 threads at DOP 8; but 8 of those are (rather pointlessly) allocated to the parallel scan of the temporary table. These are minor concerns, but it turns out there is a way to address them if it bothers you. Parallel Collocated Merge Join with Demand Partitioning This final tweak replaces the temporary table with a hard-coded list of partition ids (dynamic SQL could be used to generate this query from sys.partitions): SELECT row_count = SUM(Subtotals.cnt) FROM ( VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10), (11),(12),(13),(14),(15),(16),(17),(18),(19),(20), (21),(22),(23),(24),(25),(26),(27),(28),(29),(30), (31),(32),(33),(34),(35),(36),(37),(38),(39),(40),(41) ) AS P (partition_number) CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals OPTION (QUERYTRACEON 8649); The actual execution plan is: The parallel collocated hash join plan is reproduced below for comparison: The manual rewrite has another advantage that has not been mentioned so far: the partial counts (per partition) can be computed earlier than the partial counts (per thread) in the optimizer’s collocated join plan. The earlier aggregation is performed by the extra Stream Aggregate under the nested loops join. The performance of the parallel collocated merge join is unchanged at around 1350ms. Final Words It is a shame that the current query optimizer does not consider a collocated merge join (Connect item closed as Won’t Fix). The example used in this post showed an improvement in execution time from 2600ms to 1350ms using a modestly-sized data set and limited parallelism. In addition, the memory requirement for the query was almost completely eliminated – down from 569MB to 1.2MB. The problem with the parallel hash join selected by the optimizer is that it attempts to process the full data set all at once (albeit using eight threads). It requires a large memory grant to hold all 5 million rows from table T1 across the eight hash tables, and does not take advantage of the divide-and-conquer opportunity offered by the common partitioning. The great thing about the collocated join strategies is that each parallel thread works on a single partition from both tables, reading rows, performing the join, and computing a per-partition subtotal, before moving on to a new partition. From a thread’s point of view… If you have trouble visualizing what is happening from just looking at the parallel collocated merge join execution plan, let’s look at it again, but from the point of view of just one thread operating between the two Parallelism (exchange) operators. Our thread picks up a single partition id from the Distribute Streams exchange, and starts a merge join using ordered rows from partition 1 of table T1 and partition 1 of table T2. By definition, this is all happening on a single thread. As rows join, they are added to a (per-partition) count in the Stream Aggregate immediately above the Merge Join. Eventually, either T1 (partition 1) or T2 (partition 1) runs out of rows and the merge join stops. The per-partition count from the aggregate passes on through the Nested Loops join to another Stream Aggregate, which is maintaining a per-thread subtotal. Our same thread now picks up a new partition id from the exchange (say it gets id 9 this time). The count in the per-partition aggregate is reset to zero, and the processing of partition 9 of both tables proceeds just as it did for partition 1, and on the same thread. Each thread picks up a single partition id and processes all the data for that partition, completely independently from other threads working on other partitions. One thread might eventually process partitions (1, 9, 17, 25, 33, 41) while another is concurrently processing partitions (2, 10, 18, 26, 34) and so on for the other six threads at DOP 8. The point is that all 8 threads can execute independently and concurrently, continuing to process new partitions until the wider job (of which the thread has no knowledge!) is done. This divide-and-conquer technique can be much more efficient than simply splitting the entire workload across eight threads all at once. Related Reading Understanding and Using Parallelism in SQL Server Parallel Execution Plans Suck © 2013 Paul White – All Rights Reserved Twitter: @SQL_Kiwi

Read the article
Towards Database Continuous Delivery – What Next after Continuous Integration? A Checklist

- by Ben Rees

.dbd-banner p{ font-size:0.75em; padding:0 0 10px; margin:0 } .dbd-banner p span{ color:#675C6D; } .dbd-banner p:last-child{ padding:0; } @media ALL and (max-width:640px){ .dbd-banner{ background:#f0f0f0; padding:5px; color:#333; margin-top: 5px; } } -- Database delivery patterns & practices STAGE 4 AUTOMATED DEPLOYMENT If you’ve been fortunate enough to get to the stage where you’ve implemented some sort of continuous integration process for your database updates, then hopefully you’re seeing the benefits of that investment – constant feedback on changes your devs are making, advanced warning of data loss (prior to the production release on Saturday night!), a nice suite of automated tests to check business logic, so you know it’s going to work when it goes live, and so on. But what next? What can you do to improve your delivery process further, moving towards a full continuous delivery process for your database? In this article I describe some of the issues you might need to tackle on the next stage of this journey, and how to plan to overcome those obstacles before they appear. Our Database Delivery Learning Program consists of four stages, really three – source controlling a database, running continuous integration processes, then how to set up automated deployment (the middle stage is split in two – basic and advanced continuous integration, making four stages in total). If you’ve managed to work through the first three of these stages – source control, basic, then advanced CI, then you should have a solid change management process set up where, every time one of your team checks in a change to your database (whether schema or static reference data), this change gets fully tested automatically by your CI server. But this is only part of the story. Great, we know that our updates work, that the upgrade process works, that the upgrade isn’t going to wipe our 4Tb of production data with a single DROP TABLE. But – how do you get this (fully tested) release live? Continuous delivery means being always ready to release your software at any point in time. There’s a significant gap between your latest version being tested, and it being easily releasable. Just a quick note on terminology – there’s a nice piece here from Atlassian on the difference between continuous integration, continuous delivery and continuous deployment. This piece also gives a nice description of the benefits of continuous delivery. These benefits have been summed up by Jez Humble at Thoughtworks as: “Continuous delivery is a set of principles and practices to reduce the cost, time, and risk of delivering incremental changes to users” There’s another really useful piece here on Simple-Talk about the need for continuous delivery and how it applies to the database written by Phil Factor – specifically the extra needs and complexities of implementing a full CD solution for the database (compared to just implementing CD for, say, a web app). So, hopefully you’re convinced of moving on the the next stage! The next step after CI is to get some sort of automated deployment (or “release management”) process set up. But what should I do next? What do I need to plan and think about for getting my automated database deployment process set up? Can’t I just install one of the many release management tools available and hey presto, I’m ready! If only it were that simple. Below I list some of the areas that it’s worth spending a little time on, where a little planning and prep could go a long way. It’s also worth pointing out, that this should really be an evolving process. Depending on your starting point of course, it can be a long journey from your current setup to a full continuous delivery pipeline. If you’ve got a CI mechanism in place, you’re certainly a long way down that path. Nevertheless, we’d recommend evolving your process incrementally. Pages 157 and 129-141 of the book on Continuous Delivery (by Jez Humble and Dave Farley) have some great guidance on building up a pipeline incrementally: http://www.amazon.com/Continuous-Delivery-Deployment-Automation-Addison-Wesley/dp/0321601912 For now, in this post, we’ll look at the following areas for your checklist: You and Your Team Environments The Deployment Process Rollback and Recovery Development Practices You and Your Team It’s a cliché in the DevOps community that “It’s not all about processes and tools, really it’s all about a culture”. As stated in this DevOps report from Puppet Labs: “DevOps processes and tooling contribute to high performance, but these practices alone aren’t enough to achieve organizational success. The most common barriers to DevOps adoption are cultural: lack of manager or team buy-in, or the value of DevOps isn’t understood outside of a specific group”. Like most clichés, there’s truth in there – if you want to set up a database continuous delivery process, you need to get your boss, your department, your company (if relevant) onside. Why? Because it’s an investment with the benefits coming way down the line. But the benefits are huge – for HP, in the book A Practical Approach to Large-Scale Agile Development: How HP Transformed LaserJet FutureSmart Firmware, these are summarized as: -2008 to present: overall development costs reduced by 40% -Number of programs under development increased by 140% -Development costs per program down 78% -Firmware resources now driving innovation increased by a factor of 8 (from 5% working on new features to 40% But what does this mean? It means that, when moving to the next stage, to make that extra investment in automating your deployment process, it helps a lot if everyone is convinced that this is a good thing. That they understand the benefits of automated deployment and are willing to make the effort to transform to a new way of working. Incidentally, if you’re ever struggling to convince someone of the value I’d strongly recommend just buying them a copy of this book – a great read, and a very practical guide to how it can really work at a large org. I’ve spoken to many customers who have implemented database CI who describe their deployment process as “The point where automation breaks down. Up to that point, the CI process runs, untouched by human hand, but as soon as that’s finished we revert to manual.” This deployment process can involve, for example, a DBA manually comparing an environment (say, QA) to production, creating the upgrade scripts, reading through them, checking them against an Excel document emailed to him/her the night before, turning to page 29 in his/her notebook to double-check how replication is switched off and on for deployments, and so on and so on. Painful, error-prone and lengthy. But the point is, if this is something like your deployment process, telling your DBA “We’re changing everything you do and your toolset next week, to automate most of your role – that’s okay isn’t it?” isn’t likely to go down well. There’s some work here to bring him/her onside – to explain what you’re doing, why there will still be control of the deployment process and so on. Or of course, if you’re the DBA looking after this process, you have to do a similar job in reverse. You may have researched and worked out how you’d like to change your methodology to start automating your painful release process, but do the dev team know this? What if they have to start producing different artifacts for you? Will they be happy with this? Worth talking to them, to find out. As well as talking to your DBA/dev team, the other group to get involved before implementation is your manager. And possibly your manager’s manager too. As mentioned, unless there’s buy-in “from the top”, you’re going to hit problems when the implementation starts to get rocky (and what tool/process implementations don’t get rocky?!). You need to have support from someone senior in your organisation – someone you can turn to when you need help with a delayed implementation, lack of resources or lack of progress. Actions: Get your DBA involved (or whoever looks after live deployments) and discuss what you’re planning to do or, if you’re the DBA yourself, get the dev team up-to-speed with your plans, Get your boss involved too and make sure he/she is bought in to the investment. Environments Where are you going to deploy to? And really this question is – what environments do you want set up for your deployment pipeline? Assume everyone has “Production”, but do you have a QA environment? Dedicated development environments for each dev? Proper pre-production? I’ve seen every setup under the sun, and there is often a big difference between “What we want, to do continuous delivery properly” and “What we’re currently stuck with”. Some of these differences are: What we want What we’ve got Each developer with their own dedicated database environment A single shared “development” environment, used by everyone at once An Integration box used to test the integration of all check-ins via the CI process, along with a full suite of unit-tests running on that machine In fact if you have a CI process running, you’re likely to have some sort of integration server running (even if you don’t call it that!). Whether you have a full suite of unit tests running is a different question… Separate QA environment used explicitly for manual testing prior to release “We just test on the dev environments, or maybe pre-production” A proper pre-production (or “staging”) box that matches production as closely as possible Hopefully a pre-production box of some sort. But does it match production closely!? A production environment reproducible from source control A production box which has drifted significantly from anything in source control The big question is – how much time and effort are you going to invest in fixing these issues? In reality this just involves figuring out which new databases you’re going to create and where they’ll be hosted – VMs? Cloud-based? What about size/data issues – what data are you going to include on dev environments? Does it need to be masked to protect access to production data? And often the amount of work here really depends on whether you’re working on a new, greenfield project, or trying to update an existing, brownfield application. There’s a world if difference between starting from scratch with 4 or 5 clean environments (reproducible from source control of course!), and trying to re-purpose and tweak a set of existing databases, with all of their surrounding processes and quirks. But for a proper release management process, ideally you have: Dedicated development databases, An Integration server used for testing continuous integration and running unit tests. [NB: This is the point at which deployments are automatic, without human intervention. Each deployment after this point is a one-click (but human) action], QA – QA engineers use a one-click deployment process to automatically* deploy chosen releases to QA for testing, Pre-production. The environment you use to test the production release process, Production. * A note on the use of the word “automatic” – when carrying out automated deployments this does not mean that the deployment is happening without human intervention (i.e. that something is just deploying over and over again). It means that the process of carrying out the deployment is automatic in that it’s not a person manually running through a checklist or set of actions. The deployment still requires a single-click from a user. Actions: Get your environments set up and ready, Set access permissions appropriately, Make sure everyone understands what the environments will be used for (it’s not a “free-for-all” with all environments to be accessed, played with and changed by development). The Deployment Process As described earlier, most existing database deployment processes are pretty manual. The following is a description of a process we hear very often when we ask customers “How do your database changes get live? How does your manual process work?” Check pre-production matches production (use a schema compare tool, like SQL Compare). Sometimes done by taking a backup from production and restoring in to pre-prod, Again, use a schema compare tool to find the differences between the latest version of the database ready to go live (i.e. what the team have been developing). This generates a script, User (generally, the DBA), reviews the script. This often involves manually checking updates against a spreadsheet or similar, Run the script on pre-production, and check there are no errors (i.e. it upgrades pre-production to what you hoped), If all working, run the script on production.* * this assumes there’s no problem with production drifting away from pre-production in the interim time period (i.e. someone has hacked something in to the production box without going through the proper change management process). This difference could undermine the validity of your pre-production deployment test. Red Gate is currently working on a free tool to detect this problem – sign up here at www.sqllighthouse.com, if you’re interested in testing early versions. There are several variations on this process – some better, some much worse! How do you automate this? In particular, step 3 – surely you can’t automate a DBA checking through a script, that everything is in order!? The key point here is to plan what you want in your new deployment process. There are so many options. At one extreme, pure continuous deployment – whenever a dev checks something in to source control, the CI process runs (including extensive and thorough testing!), before the deployment process keys in and automatically deploys that change to the live box. Not for the faint hearted – and really not something we recommend. At the other extreme, you might be more comfortable with a semi-automated process – the pre-production/production matching process is automated (with an error thrown if these environments don’t match), followed by a manual intervention, allowing for script approval by the DBA. One he/she clicks “Okay, I’m happy for that to go live”, the latter stages automatically take the script through to live. And anything in between of course – and other variations. But we’d strongly recommended sitting down with a whiteboard and your team, and spending a couple of hours mapping out “What do we do now?”, “What do we actually want?”, “What will satisfy our needs for continuous delivery, but still maintaining some sort of continuous control over the process?” NB: Most of what we’re discussing here is about production deployments. It’s important to note that you will also need to map out a deployment process for earlier environments (for example QA). However, these are likely to be less onerous, and many customers opt for a much more automated process for these boxes. Actions: Sit down with your team and a whiteboard, and draw out the answers to the questions above for your production deployments – “What do we do now?”, “What do we actually want?”, “What will satisfy our needs for continuous delivery, but still maintaining some sort of continuous control over the process?” Repeat for earlier environments (QA and so on). Rollback and Recovery If only every deployment went according to plan! Unfortunately they don’t – and when things go wrong, you need a rollback or recovery plan for what you’re going to do in that situation. Once you move in to a more automated database deployment process, you’re far more likely to be deploying more frequently than before. No longer once every 6 months, maybe now once per week, or even daily. Hence the need for a quick rollback or recovery process becomes paramount, and should be planned for. NB: These are mainly scenarios for handling rollbacks after the transaction has been committed. If a failure is detected during the transaction, the whole transaction can just be rolled back, no problem. There are various options, which we’ll explore in subsequent articles, things like: Immediately restore from backup, Have a pre-tested rollback script (remembering that really this is a “roll-forward” script – there’s not really such a thing as a rollback script for a database!) Have fallback environments – for example, using a blue-green deployment pattern. Different options have pros and cons – some are easier to set up, some require more investment in infrastructure; and of course some work better than others (the key issue with using backups, is loss of the interim transaction data that has been added between the failed deployment and the restore). The best mechanism will be primarily dependent on how your application works and how much you need a cast-iron failsafe mechanism. Actions: Work out an appropriate rollback strategy based on how your application and business works, your appetite for investment and requirements for a completely failsafe process. Development Practices This is perhaps the more difficult area for people to tackle. The process by which you can deploy database updates is actually intrinsically linked with the patterns and practices used to develop that database and linked application. So you need to decide whether you want to implement some changes to the way your developers actually develop the database (particularly schema changes) to make the deployment process easier. A good example is the pattern “Branch by abstraction”. Explained nicely here, by Martin Fowler, this is a process that can be used to make significant database changes (e.g. splitting a table) in a step-wise manner so that you can always roll back, without data loss – by making incremental updates to the database backward compatible. Slides 103-108 of the following slidedeck, from Niek Bartholomeus explain the process: https://speakerdeck.com/niekbartho/orchestration-in-meatspace As these slides show, by making a significant schema change in multiple steps – where each step can be rolled back without any loss of new data – this affords the release team the opportunity to have zero-downtime deployments with considerably less stress (because if an increment goes wrong, they can roll back easily). There are plenty more great patterns that can be implemented – the book Refactoring Databases, by Scott Ambler and Pramod Sadalage is a great read, if this is a direction you want to go in: http://www.amazon.com/Refactoring-Databases-Evolutionary-paperback-Addison-Wesley/dp/0321774515 But the question is – how much of this investment are you willing to make? How often are you making significant schema changes that would require these best practices? Again, there’s a difference here between migrating old projects and starting afresh – with the latter it’s much easier to instigate best practice from the start. Actions: For your business, work out how far down the path you want to go, amending your database development patterns to “best practice”. It’s a trade-off between implementing quality processes, and the necessity to do so (depending on how often you make complex changes). Socialise these changes with your development group. No-one likes having “best practice” changes imposed on them, so good to introduce these ideas and the rationale behind them early. Summary The next stages of implementing a continuous delivery pipeline for your database changes (once you have CI up and running) require a little pre-planning, if you want to get the most out of the work, and for the implementation to go smoothly. We’ve covered some of the checklist of areas to consider – mainly in the areas of “Getting the team ready for the changes that are coming” and “Planning our your pipeline, environments, patterns and practices for development”, though there will be more detail, depending on where you’re coming from – and where you want to get to. This article is part of our database delivery patterns & practices series on Simple Talk. Find more articles for version control, automated testing, continuous integration & deployment.

Read the article
JQuery Hover li Show div which sits outside li structure

- by Dave_Stott

Hi everyone I'm currently trying to create a "mega" dropout menu using JQuery but have encountered an issue I'm yet to be able to resolve. At the moment I have the following HTML structure: <div id="TopNav" class="grid_16"> <ul class="cmsListMenuUL level0" id="TopNavMenu"> <li class="cmsListMenuLIcmsListMenuLI highlightedLI" id="TopNavMenu_Home"><a href="/"> <span class="text">Home</span></a></li> <li class="cmsListMenuLIfirst" id="TopNavMenu_0_1"><a href="/Key-Sectors.aspx" class="cmsListMenuLink"> <span class="text">Key Sectors</span></a></li> <li class="cmsListMenuLI" id="TopNavMenu_0_2"><a href="/Global-Brands.aspx" class="cmsListMenuLink"> <span class="text">Global Brands</span></a></li> <li class="cmsListMenuLI" id="TopNavMenu_0_3"><a href="/News---Features.aspx" class="cmsListMenuLink"> <span class="text">News & Features</span></a></li> <li class="cmsListMenuLI" id="TopNavMenu_0_4"><a href="/Videos.aspx" class="cmsListMenuLink"> <span class="text">Videos</span></a></li> <li class="cmsListMenuLI" id="TopNavMenu_0_5"><a href="/Events.aspx" class="cmsListMenuLink"> <span class="text">Events</span></a></li> <li class="cmsListMenuLI" id="TopNavMenu_0_6"><a href="/Key-Cities.aspx" class="cmsListMenuLink"> <span class="text">Key Cities</span></a></li> <li class="cmsListMenuLI" id="TopNavMenu_0_7"><a href="/Doing-Business-in-Yorkshire.aspx" class="cmsListMenuLink"><span class="text">Doing Business in Yorkshire</span></a></li> <li class="cmsListMenuLI" id="TopNavMenu_0_8"><a href="/How-We-Can-Help.aspx" class="cmsListMenuLink"> <span class="text">How We Can Help</span></a></li> <li class="cmsListMenuLI" id="TopNavMenu_0_9"><a href="/Contact-Us.aspx" class="cmsListMenuLink"> <span class="text">Contact Us</span></a></li> </ul> </div> <div class="sectorsDropped"> <div class="floatLeft leftColumn"> <div class="parentItem" style="border-color: #0064BE;"> <a href="/Key-Sectors/Advanced-Engineering---Materials.aspx" class="parentItemContent"> Advanced Engineering & Materials</a><div class="childItem"> <a href="/Key-Sectors/Advanced-Engineering---Materials/Nuclear.aspx">- Nuclear</a></div> <div class="childItem"> <a href="/Key-Sectors/Advanced-Engineering---Materials/Logistics---Infrastructure.aspx"> - Logistics & Infrastructure</a></div> </div> <div class="parentItem" style="border-color: #FFB611;"> <a href="/Key-Sectors/Chemicals.aspx" class="parentItemContent">Chemicals</a></div> <div class="parentItem" style="border-color: #B7CC0B;"> <a href="/Key-Sectors/Environmental-Technologies.aspx" class="parentItemContent">Environmental Technologies</a><div class="childItem"> <a href="/Key-Sectors/Environmental-Technologies/Offshore-Wind.aspx">- Offshore Wind</a></div> <div class="childItem"> <a href="/Key-Sectors/Environmental-Technologies/Carbon-Capture---Storage.aspx">- Carbon Capture & Storage</a></div> <div class="childItem"> <a href="/Key-Sectors/Environmental-Technologies/Tidal-Power.aspx">- Tidal Power</a></div> <div class="childItem"> <a href="/Key-Sectors/Environmental-Technologies/Biomass.aspx">- Biomass</a></div> </div> </div> <div class="floatLeft rightColumn"> <div class="parentItem" style="border-color: #AC26AA;"> <a href="/Key-Sectors/Digital---New-Media.aspx" class="parentItemContent">Digital & New Media</a></div> <div class="parentItem" style="border-color: #e1477e;"> <a href="/Key-Sectors/Food---Drink.aspx" class="parentItemContent">Food & Drink</a></div> <div class="parentItem" style="border-color: #00c5b5;"> <a href="/Key-Sectors/Healthcare-Technologies.aspx" class="parentItemContent">Healthcare Technologies</a><div class="childItem"> <a href="/Key-Sectors/Healthcare-Technologies/Biotechnology.aspx">- Biotechnology</a></div> <div class="childItem"> <a href="/Key-Sectors/Healthcare-Technologies/Pharmaceuticals.aspx">- Pharmaceuticals</a></div> <div class="childItem"> <a href="/Key-Sectors/Healthcare-Technologies/Medical-Devices.aspx">- Medical Devices</a></div> </div> <div class="parentItem" style="border-color: #AC1A2F;"> <a href="/Key-Sectors/Financial---Professional.aspx" class="parentItemContent">Financial & Professional</a></div> </div> </div> In normal circumstances the div containing the "mega" menu options would sit inside the li item that fires the show/hide but this is currently not possible as the ul list of navigation links is rendered using a 3rd party piece of software which does not provide an equivalent of an OnItemDataBound event for me to be able to inject the div into the item Does anyone know of a way, using JQuery, of showing the div but maintain the display of the div as the mouse focus leaves the li that originaly displayed the div and actually enters the div? I'm currently using the following JQuery which displays the div correctly but as the mouse focus enters the div the div then disappears as the mouse focus from the li has now moved: $(document).ready(function() { function addMega(){ $(".sectorsDropped").toggle("fast"); } function removeMega(){ $(".sectorsDropped").toggle("fast"); } var megaConfig = { interval: 500, sensitivity: 4, over: addMega, timeout: 500, out: removeMega }; $("#TopNavMenu_0_1").hoverIntent(megaConfig) }); Thanks Dave

Read the article
CSS layout problem on Firefox with filling space between end of left column and footer

- by Jean

Basically, the left column is supposed to extend to the footer with the continuous red color. However, in Firefox on pages with lots of text, the column does not extend to the footer and leaves a large white gap--see site: http://library.luhs.org/JHSII/about.html I've tried readjusting the heights, creating the sticky footer, and other things I've read about on this site. So I admit that I'm stumped, and what's really odd is that the layout seems to work in IE as there is no white space! I didn't create the site, but I recently inherited it and trying to work through the mess Any help is much appreciated, here's the CSS #html,body{ margin:0; padding:0; border:0; height:100%; } #body{ background:#ffffff; min-width:965px; text-align:center; width: 600px; font: Geneva, Arial, Helvetica, sans-serif; } #.style7{ clear:both; height:1px; overflow:hidden; line-height:1%; font-size:0px; margin-bottom:-1px; } #fullheightcontainer{ margin-left:auto; margin-right:auto; text-align:left; position:relative; width:965px; height:100%; } #wrapper{ min-height:100%; height:100%; background:#660000; background-color: #660000; background-repeat: repeat; } #wrapp\65 r{ height:auto; } # html wrapper{ height:100%; } #outer{ z-index:1; position:relative; margin-left:150px; width:815px; background:#FFFFFF; height:100%; background-color: #FFFFFF; } #left{ width:151px; float:left; display:inline; position:relative; margin-left:-150px; } padding: 20px; border: 0; margin: 0 0 0 240px *>html #left{width:150px;} #container-left{ width:150px; color: #CCCCCC; } * html #left{margin-right:-3px;} #center{ width:800px; float:right; display:inline; margin-left:-1px; } #clearheadercenter{ height:125px; overflow:hidden; } #clearfootercenter{ height:50px; overflow:hidden; } #footer{ z-index:1; position:relative; clear: both; width:965px; height:50px; overflow:hidden; margin-top:-50px; background-color: #660000; } #subfooter1{ background:#FFFFCC; text-align:left; margin-left:150px; height:50px; } #header{ z-index:1; position:absolute; top:0px; width:815px; margin-left:150px; height:100px; overflow:hidden; background-color: #660000; } #subheader1{ background:#FFFFCC; text-align:center; height:70px; } #gfx_bg_middle{ top:0px; position:absolute; height:100%; overflow:hidden; width:815px; margin-left:150px; background:#FFFFFF; } # html #gfx_bg_middle{ display:none; } #floatingnav { margin: 5px 10px 5px 5px; padding: 0px 5px 5px; float: right; font: .75em/1.35em Geneva, Arial, Helvetica, sans-serif; height: 600px; width: 300px; } #floatingnav a { color: #630; } #floatingnav ul { margin-top: -5; } #.floatright { float: right; margin: 0 0 10px 10px; border: 1px solid #666; padding: 2px; } #outer{ word-wrap:break-word; } #table.s1 { border-width: medium; border-spacing: 2px; border-style: none; border-color: rgb(85, 0, 0); border-collapse: collapse; background-color: white; } #table.s1 th { border-width: medium; padding: 2px; border-style: groove; border-color: red; background-color: white; -moz-border-radius: 0px 0px 0px 0px; } #table.s1 td { border-width: medium; padding: 2px; border-style: groove; border-color: #660000; background-color: #FFFFFF; -moz-border-radius: 0px 0px 0px 0px; } #a:link { color: #000066; } #a:visited { color: #000066; } #p.sample { font-family: serif; font-style: normal; font-variant: normal; font-weight: normal; font-size: medium; line-height: 100%; word-spacing: normal; letter-spacing: normal; text-decoration: none; text-transform: none; text-align: left; text-indent: 0ex; }

Read the article
XSD and plain text

- by Paul Knopf

I have a rest/xml service that gives me the following... <verse-unit unit-id="38009001"> <marker class="begin-verse" mid="v38009001"/> <begin-chapter num="9"/><heading>Judgment on Israel's Enemies</heading> <begin-block-indent/> <begin-paragraph class="line-group"/> <begin-line/><verse-num begin-chapter="9">1</verse-num>The burden of the word of the <span class="divine-name">Lord</span> is against the land of Hadrach<end-line class="br"/> <begin-line class="indent"/>and Damascus is its resting place.<end-line class="br"/> <begin-line/>For the <span class="divine-name">Lord</span> has an eye on mankind<end-line class="br"/> <begin-line class="indent"/>and on all the tribes of Israel,<footnote id="f1"> A slight emendation yields <i> For to the <span class="divine-name">Lord</span> belongs the capital of Syria and all the tribes of Israel </i> </footnote><end-line class="br"/> </verse-unit> I used visual studio to generate a schema from this and used XSD.EXE to generate classes that I can use to deserialize this mess into programmable stuff. I got everything to work and it is deserialized perfectly (almost). The problem I have is with the random text mixed throughout the child nodes. The generated verse-unit objects gives me a list of objects (begin-line, begin-block-indent, etc), and also another list of string objects that represent the bits of string throughout the xml. Here is my schema <xs:element maxOccurs="unbounded" name="verse-unit"> <xs:complexType mixed="true"> <xs:sequence> <xs:choice maxOccurs="unbounded"> <xs:element name="marker"> <xs:complexType> <xs:attribute name="class" type="xs:string" use="required" /> <xs:attribute name="mid" type="xs:string" use="required" /> </xs:complexType> </xs:element> <xs:element name="begin-chapter"> <xs:complexType> <xs:attribute name="num" type="xs:unsignedByte" use="required" /> </xs:complexType> </xs:element> <xs:element name="heading"> <xs:complexType mixed="true"> <xs:sequence minOccurs="0"> <xs:element name="span"> <xs:complexType> <xs:simpleContent> <xs:extension base="xs:string"> <xs:attribute name="class" type="xs:string" use="required" /> </xs:extension> </xs:simpleContent> </xs:complexType> </xs:element> </xs:sequence> </xs:complexType> </xs:element> <xs:element name="begin-block-indent" /> <xs:element name="begin-paragraph"> <xs:complexType> <xs:attribute name="class" type="xs:string" use="required" /> </xs:complexType> </xs:element> <xs:element name="begin-line"> <xs:complexType> <xs:attribute name="class" type="xs:string" use="optional" /> </xs:complexType> </xs:element> <xs:element name="verse-num"> <xs:complexType> <xs:simpleContent> <xs:extension base="xs:unsignedByte"> <xs:attribute name="begin-chapter" type="xs:unsignedByte" use="optional" /> </xs:extension> </xs:simpleContent> </xs:complexType> </xs:element> <xs:element name="end-line"> <xs:complexType> <xs:attribute name="class" type="xs:string" use="optional" /> </xs:complexType> </xs:element> <xs:element name="end-paragraph" /> <xs:element name="end-block-indent" /> <xs:element name="end-chapter" /> </xs:choice> </xs:sequence> <xs:attribute name="unit-id" type="xs:unsignedInt" use="required" /> </xs:complexType> </xs:element> WHAT I NEED IS THIS. I need the random text that is NOT surrounded by an xml node to be represented by an object so I know the order that everything is in. I know this is complicated, so let me try to simplify it. <field name="test_field_0"> Some text I'm sure you don't want. <subfield>Some text.</subfield> More text you don't want. </field> I need the xsd to generate a field object with items that can have either a text object, or a subfield object. I need to no where the random text is within the child nodes.

Read the article
CSS Positioning

- by Davey

Trying to mess with this wordpress theme and can't figure out why the sidebar is stacking underneath the content block. Any help would be very appreciated. http://www.buffalostreetbooks.com/events CSS: body { font-family: Arial, Helvetica, Verdana, Sans-serif; font-size: 10pt; background-color: #692022; background-image:url("http://www.buffalostreetbooks.com/wp-content/themes/autumn-leaves/images/repeatflower.png"); } body,h1#blog-title { margin: 0; padding: 0; } a { color: blue; } a:hover { color: #FF8C00; } a img { border: 0 none; } #wrapper { width: 960px; margin: 0 auto; background-color: #F4FBF4; border-left: 1px solid #ccc; border-right: 1px solid #ccc; } #header { background-image:url("http://www.buffalostreetbooks.com/wp-content/themes/autumn-leaves/images/headertime.png"); width:768px; height: 200px; } #inner-header { padding: 125px 1em 0; } h1#blog-title { font-size: 2em; } h1#blog-title a { color: #800000; } .entry-title a { color: #CD853F; } h1#blog-title a, .entry-title a, #footer a { text-decoration: none; } h1#blog-title a:hover, .entry-title a:hover, #footer a:hover { text-decoration: underline; } div.skip-link { display: none; } #menu { border-bottom: 1px solid #ccc; } #menu a { color: #000; } #menu a:hover { text-decoration: underline; } #menu li.current_page_item a, #menu li.current_page_item a:hover { background-color: #DFC28B; text-decoration: none; } #content { padding: 1em; width:600px; } .entry-title { font-size: 1.5em; margin: 1em 0 0 0; } abbr.published { color: #666; border: 0 none; } .entry-meta, .entry-date { color: #666; } #comments-list .avatar { float: left; margin-right: 1em; } #comments-list .n { font-weight: bold; } .entry-meta, .comment-meta { font-style: italic; } #comments-list p { clear: left; } #primary { padding-left: 1em; font-size: 0.9em; border-left: 1px solid #ccc; border-bottom: 1px solid #ccc; background-color: #FFFACD; } #footer { text-align: center; font-size: 0.8em; border-top: 1px solid #ccc; border-bottom: 1px solid #ccc; margin-bottom: 1em; } #inner-footer { padding: 1em 0; } .entry-meta, .entry-meta a, .comment-meta, .comment-meta a, .sidebar, .sidebar a, #footer, #footer a { color: #666; } /* LAYOUT: Two-Column (Right) DESCRIPTION: Two-column fluid layout with one sidebars right of content */ div#container { margin:0 0 0 0; width:960px; height:100%; } div#content { margin:0 0 0 0; } div.sidebar { overflow:hidden; width:280px; min-height:500px; clear:both; } div#secondary { clear:right; } div#footer { clear:both; width:100%; } /* Just some example content */ div#menu { height:2em; width:100%; } div#menu ul,div#menu ul ul { line-height:2em; list-style:none; margin:0; padding:0; } div#menu ul a { display:block; margin-right:1em; padding:0 0.5em; text-decoration:none; } div#menu ul ul ul a { font-style:italic; } div#menu ul li ul { left:-999em; position:absolute; } div#menu ul li:hover ul { left:auto; } .entry-title,.entry-meta { clear:both; } div#primary { } form#commentform .form-label { margin:1em 0 0; } form#commentform span.required { background:#fff; color:#c30; } form#commentform,form#commentform p { padding:0; } input#author,input#email,input#url,textarea#comme nt { padding:0.2em; } div.comments ol li { margin:0 0 3.5em; } textarea#comment { height:13em; margin:0 0 0.5em; overflow:auto; width:66%; } .alignright,img.alignright{ float:right; margin:1em 0 0 1em; } .alignleft,img.alignleft{ float:left; margin:1em 1em 0 0; } .aligncenter,img.aligncenter{ display:block; margin:1em auto; text-align:center; } div.gallery { clear:both; height:180px; margin:1em 0; width:100%; } p.wp-caption-text{ font-style:italic; } div.gallery dl{ margin:1em auto; overflow:hidden; text-align:center; } div.gallery dl.gallery-columns-1 { width:100%; } div.gallery dl.gallery-columns-2 { width:49%; } div.gallery dl.gallery-columns-3 { width:33%; } div.gallery dl.gallery-columns-4 { width:24%; } div.gallery dl.gallery-columns-5 { width:19%; } div#nav-above { margin-bottom:1em; } div#nav-below { margin-top:1em; } div#nav-images { height:150px; margin:1em 0; } div.navigation { height:1.25em; } div.navigation div.nav-next { float:right; text-align:right; } div.sidebar h3 { font-size:1.2em; } div.sidebar input#s { width:7em; } div.sidebar li { list-style:none; margin:0 0 2em; } div.sidebar li form { margin:0.2em 0 0; padding:0; } div.sidebar ul ul { margin:0 0 0 2em; } div.sidebar ul ul li { list-style:disc; margin:0; } div.sidebar ul ul ul { margin:0 0 0 0.5em; } div.sidebar ul ul ul li { list-style:circle; } div#menu ul li,div.gallery dl,div.navigation div.nav-previous { float:left; } input#author,input#email,input#url,div.navigation div { width:50%; } div.gallery *,div.sidebar div,div.sidebar h3,div.sidebar ul { margin:0; padding:0; }

Read the article
Little more help with writing a o buffer with libjpeg

- by Richard Knop

So I have managed to find another question discussing how to use the libjpeg to compress an image to jpeg. I have found this code which is supposed to work: Compressing IplImage to JPEG using libjpeg in OpenCV Here's the code (it compiles ok): /* This a custom destination manager for jpeglib that enables the use of memory to memory compression. See IJG documentation for details. */ typedef struct { struct jpeg_destination_mgr pub; /* base class */ JOCTET* buffer; /* buffer start address */ int bufsize; /* size of buffer */ size_t datasize; /* final size of compressed data */ int* outsize; /* user pointer to datasize */ int errcount; /* counts up write errors due to buffer overruns */ } memory_destination_mgr; typedef memory_destination_mgr* mem_dest_ptr; /* ------------------------------------------------------------- */ /* MEMORY DESTINATION INTERFACE METHODS */ /* ------------------------------------------------------------- */ /* This function is called by the library before any data gets written */ METHODDEF(void) init_destination (j_compress_ptr cinfo) { mem_dest_ptr dest = (mem_dest_ptr)cinfo->dest; dest->pub.next_output_byte = dest->buffer; /* set destination buffer */ dest->pub.free_in_buffer = dest->bufsize; /* input buffer size */ dest->datasize = 0; /* reset output size */ dest->errcount = 0; /* reset error count */ } /* This function is called by the library if the buffer fills up I just reset destination pointer and buffer size here. Note that this behavior, while preventing seg faults will lead to invalid output streams as data is over- written. */ METHODDEF(boolean) empty_output_buffer (j_compress_ptr cinfo) { mem_dest_ptr dest = (mem_dest_ptr)cinfo->dest; dest->pub.next_output_byte = dest->buffer; dest->pub.free_in_buffer = dest->bufsize; ++dest->errcount; /* need to increase error count */ return TRUE; } /* Usually the library wants to flush output here. I will calculate output buffer size here. Note that results become incorrect, once empty_output_buffer was called. This situation is notified by errcount. */ METHODDEF(void) term_destination (j_compress_ptr cinfo) { mem_dest_ptr dest = (mem_dest_ptr)cinfo->dest; dest->datasize = dest->bufsize - dest->pub.free_in_buffer; if (dest->outsize) *dest->outsize += (int)dest->datasize; } /* Override the default destination manager initialization provided by jpeglib. Since we want to use memory-to-memory compression, we need to use our own destination manager. */ GLOBAL(void) jpeg_memory_dest (j_compress_ptr cinfo, JOCTET* buffer, int bufsize, int* outsize) { mem_dest_ptr dest; /* first call for this instance - need to setup */ if (cinfo->dest == 0) { cinfo->dest = (struct jpeg_destination_mgr *) (*cinfo->mem->alloc_small) ((j_common_ptr) cinfo, JPOOL_PERMANENT, sizeof (memory_destination_mgr)); } dest = (mem_dest_ptr) cinfo->dest; dest->bufsize = bufsize; dest->buffer = buffer; dest->outsize = outsize; /* set method callbacks */ dest->pub.init_destination = init_destination; dest->pub.empty_output_buffer = empty_output_buffer; dest->pub.term_destination = term_destination; } /* ------------------------------------------------------------- */ /* MEMORY SOURCE INTERFACE METHODS */ /* ------------------------------------------------------------- */ /* Called before data is read */ METHODDEF(void) init_source (j_decompress_ptr dinfo) { /* nothing to do here, really. I mean. I'm not lazy or something, but... we're actually through here. */ } /* Called if the decoder wants some bytes that we cannot provide... */ METHODDEF(boolean) fill_input_buffer (j_decompress_ptr dinfo) { /* we can't do anything about this. This might happen if the provided buffer is either invalid with regards to its content or just a to small bufsize has been given. */ /* fail. */ return FALSE; } /* From IJG docs: "it's not clear that being smart is worth much trouble" So I save myself some trouble by ignoring this bit. */ METHODDEF(void) skip_input_data (j_decompress_ptr dinfo, INT32 num_bytes) { /* There might be more data to skip than available in buffer. This clearly is an error, so screw this mess. */ if ((size_t)num_bytes > dinfo->src->bytes_in_buffer) { dinfo->src->next_input_byte = 0; /* no buffer byte */ dinfo->src->bytes_in_buffer = 0; /* no input left */ } else { dinfo->src->next_input_byte += num_bytes; dinfo->src->bytes_in_buffer -= num_bytes; } } /* Finished with decompression */ METHODDEF(void) term_source (j_decompress_ptr dinfo) { /* Again. Absolute laziness. Nothing to do here. Boring. */ } GLOBAL(void) jpeg_memory_src (j_decompress_ptr dinfo, unsigned char* buffer, size_t size) { struct jpeg_source_mgr* src; /* first call for this instance - need to setup */ if (dinfo->src == 0) { dinfo->src = (struct jpeg_source_mgr *) (*dinfo->mem->alloc_small) ((j_common_ptr) dinfo, JPOOL_PERMANENT, sizeof (struct jpeg_source_mgr)); } src = dinfo->src; src->next_input_byte = buffer; src->bytes_in_buffer = size; src->init_source = init_source; src->fill_input_buffer = fill_input_buffer; src->skip_input_data = skip_input_data; src->term_source = term_source; /* IJG recommend to use their function - as I don't know **** about how to do better, I follow this recommendation */ src->resync_to_restart = jpeg_resync_to_restart; } All I need to do is replace the jpeg_stdio_dest in my program with this code: int numBytes = 0; //size of jpeg after compression char * storage = new char[150000]; //storage buffer JOCTET *jpgbuff = (JOCTET*)storage; //JOCTET pointer to buffer jpeg_memory_dest(&cinfo,jpgbuff,150000,&numBytes); So I need some help to incorporate the above four lines into this function which now works but writes to a file instead of a memory: int write_jpeg_file( char *filename ) { struct jpeg_compress_struct cinfo; struct jpeg_error_mgr jerr; /* this is a pointer to one row of image data */ JSAMPROW row_pointer[1]; FILE *outfile = fopen( filename, "wb" ); if ( !outfile ) { printf("Error opening output jpeg file %s\n!", filename ); return -1; } cinfo.err = jpeg_std_error( &jerr ); jpeg_create_compress(&cinfo); jpeg_stdio_dest(&cinfo, outfile); /* Setting the parameters of the output file here */ cinfo.image_width = width; cinfo.image_height = height; cinfo.input_components = bytes_per_pixel; cinfo.in_color_space = color_space; /* default compression parameters, we shouldn't be worried about these */ jpeg_set_defaults( &cinfo ); /* Now do the compression .. */ jpeg_start_compress( &cinfo, TRUE ); /* like reading a file, this time write one row at a time */ while( cinfo.next_scanline < cinfo.image_height ) { row_pointer[0] = &raw_image[ cinfo.next_scanline * cinfo.image_width * cinfo.input_components]; jpeg_write_scanlines( &cinfo, row_pointer, 1 ); } /* similar to read file, clean up after we're done compressing */ jpeg_finish_compress( &cinfo ); jpeg_destroy_compress( &cinfo ); fclose( outfile ); /* success code is 1! */ return 1; } Anybody could help me out a bit with it? I've tried meddling with it but I am not sure how to do it. I I just replace this line: jpeg_stdio_dest(&cinfo, outfile); It's not going to work. There is more stuff that needs to be changed a bit in that function and I am being a little lost from all those pointers and memory management.

Read the article
Red Gate Coder interviews: Alex Davies

- by Michael Williamson

Alex Davies has been a software engineer at Red Gate since graduating from university, and is currently busy working on .NET Demon. We talked about tackling parallel programming with his actors framework, a scientific approach to debugging, and how JavaScript is going to affect the programming languages we use in years to come. So, if we start at the start, how did you get started in programming? When I was seven or eight, I was given a BBC Micro for Christmas. I had asked for a Game Boy, but my dad thought it would be better to give me a proper computer. For a year or so, I only played games on it, but then I found the user guide for writing programs in it. I gradually started doing more stuff on it and found it fun. I liked creating. As I went into senior school I continued to write stuff on there, trying to write games that weren’t very good. I got a real computer when I was fourteen and found ways to write BASIC on it. Visual Basic to start with, and then something more interesting than that. How did you learn to program? Was there someone helping you out? Absolutely not! I learnt out of a book, or by experimenting. I remember the first time I found a loop, I was like “Oh my God! I don’t have to write out the same line over and over and over again any more. It’s amazing!” When did you think this might be something that you actually wanted to do as a career? For a long time, I thought it wasn’t something that you would do as a career, because it was too much fun to be a career. I thought I’d do chemistry at university and some kind of career based on chemical engineering. And then I went to a careers fair at school when I was seventeen or eighteen, and it just didn’t interest me whatsoever. I thought “I could be a programmer, and there’s loads of money there, and I’m good at it, and it’s fun”, but also that I shouldn’t spoil my hobby. Now I don’t really program in my spare time any more, which is a bit of a shame, but I program all the rest of the time, so I can live with it. Do you think you learnt much about programming at university? Yes, definitely! I went into university knowing how to make computers do anything I wanted them to do. However, I didn’t have the language to talk about algorithms, so the algorithms course in my first year was massively important. Learning other language paradigms like functional programming was really good for breadth of understanding. Functional programming influences normal programming through design rather than actually using it all the time. I draw inspiration from it to write imperative programs which I think is actually becoming really fashionable now, but I’ve been doing it for ages. I did it first! There were also some courses on really odd programming languages, a bit of Prolog, a little bit of C. Having a little bit of each of those is something that I would have never done on my own, so it was important. And then there are knowledge-based courses which are about not programming itself but things that have been programmed like TCP. Those are really important for examples for how to approach things. Did you do any internships while you were at university? Yeah, I spent both of my summers at the same company. I thought I could code well before I went there. Looking back at the crap that I produced, it was only surpassed in its crappiness by all of the other code already in that company. I’m so much better at writing nice code now than I used to be back then. Was there just not a culture of looking after your code? There was, they just didn’t hire people for their abilities in that area. They hired people for raw IQ. The first indicator of it going wrong was that they didn’t have any computer scientists, which is a bit odd in a programming company. But even beyond that they didn’t have people who learnt architecture from anyone else. Most of them had started straight out of university, so never really had experience or mentors to learn from. There wasn’t the experience to draw from to teach each other. In the second half of my second internship, I was being given tasks like looking at new technologies and teaching people stuff. Interns shouldn’t be teaching people how to do their jobs! All interns are going to have little nuggets of things that you don’t know about, but they shouldn’t consistently be the ones who know the most. It’s not a good environment to learn. I was going to ask how you found working with people who were more experienced than you… When I reached Red Gate, I found some people who were more experienced programmers than me, and that was difficult. I’ve been coding since I was tiny. At university there were people who were cleverer than me, but there weren’t very many who were more experienced programmers than me. During my internship, I didn’t find anyone who I classed as being a noticeably more experienced programmer than me. So, it was a shock to the system to have valid criticisms rather than just formatting criticisms. However, Red Gate’s not so big on the actual code review, at least it wasn’t when I started. We did an entire product release and then somebody looked over all of the UI of that product which I’d written and say what they didn’t like. By that point, it was way too late and I’d disagree with them. Do you think the lack of code reviews was a bad thing? I think if there’s going to be any oversight of new people, then it should be continuous rather than chunky. For me I don’t mind too much, I could go out and get oversight if I wanted it, and in those situations I felt comfortable without it. If I was managing the new person, then maybe I’d be keener on oversight and then the right way to do it is continuously and in very, very small chunks. Have you had any significant projects you’ve worked on outside of a job? When I was a teenager I wrote all sorts of stuff. I used to write games, I derived how to do isomorphic projections myself once. I didn’t know what the word was so I couldn’t Google for it, so I worked it out myself. It was horrifically complicated. But it sort of tailed off when I started at university, and is now basically zero. If I do side-projects now, they tend to be work-related side projects like my actors framework, NAct, which I started in a down tools week. Could you explain a little more about NAct? It is a little C# framework for writing parallel code more easily. Parallel programming is difficult when you need to write to shared data. Sometimes parallel programming is easy because you don’t need to write to shared data. When you do need to access shared data, you could just have your threads pile in and do their work, but then you would screw up the data because the threads would trample on each other’s toes. You could lock, but locks are really dangerous if you’re using more than one of them. You get interactions like deadlocks, and that’s just nasty. Actors instead allows you to say this piece of data belongs to this thread of execution, and nobody else can read it. If you want to read it, then ask that thread of execution for a piece of it by sending a message, and it will send the data back by a message. And that avoids deadlocks as long as you follow some obvious rules about not making your actors sit around waiting for other actors to do something. There are lots of ways to write actors, NAct allows you to do it as if it was method calls on other objects, which means you get all the strong type-safety that C# programmers like. Do you think that this is suitable for the majority of parallel programming, or do you think it’s only suitable for specific cases? It’s suitable for most difficult parallel programming. If you’ve just got a hundred web requests which are all independent of each other, then I wouldn’t bother because it’s easier to just spin them up in separate threads and they can proceed independently of each other. But where you’ve got difficult parallel programming, where you’ve got multiple threads accessing multiple bits of data in multiple ways at different times, then actors is at least as good as all other ways, and is, I reckon, easier to think about. When you’re using actors, you presumably still have to write your code in a different way from you would otherwise using single-threaded code. You can’t use actors with any methods that have return types, because you’re not allowed to call into another actor and wait for it. If you want to get a piece of data out of another actor, then you’ve got to use tasks so that you can use “async” and “await” to await asynchronously for it. But other than that, you can still stick things in classes so it’s not too different really. Rather than having thousands of objects with mutable state, you can use component-orientated design, where there are only a few mutable classes which each have a small number of instances. Then there can be thousands of immutable objects. If you tend to do that anyway, then actors isn’t much of a jump. If I’ve already built my system without any parallelism, how hard is it to add actors to exploit all eight cores on my desktop? Usually pretty easy. If you can identify even one boundary where things look like messages and you have components where some objects live on one side and these other objects live on the other side, then you can have a granddaddy object on one side be an actor and it will parallelise as it goes across that boundary. Not too difficult. If we do get 1000-core desktop PCs, do you think actors will scale up? It’s hard. There are always in the order of twenty to fifty actors in my whole program because I tend to write each component as actors, and I tend to have one instance of each component. So this won’t scale to a thousand cores. What you can do is write data structures out of actors. I use dictionaries all over the place, and if you need a dictionary that is going to be accessed concurrently, then you could build one of those out of actors in no time. You can use queuing to marshal requests between different slices of the dictionary which are living on different threads. So it’s like a distributed hash table but all of the chunks of it are on the same machine. That means that each of these thousand processors has cached one small piece of the dictionary. I reckon it wouldn’t be too big a leap to start doing proper parallelism. Do you think it helps if actors get baked into the language, similarly to Erlang? Erlang is excellent in that it has thread-local garbage collection. C# doesn’t, so there’s a limit to how well C# actors can possibly scale because there’s a single garbage collected heap shared between all of them. When you do a global garbage collection, you’ve got to stop all of the actors, which is seriously expensive, whereas in Erlang garbage collections happen per-actor, so they’re insanely cheap. However, Erlang deviated from all the sensible language design that people have used recently and has just come up with crazy stuff. You can definitely retrofit thread-local garbage collection to .NET, and then it’s quite well-suited to support actors, even if it’s not baked into the language. Speaking of language design, do you have a favourite programming language? I’ll choose a language which I’ve never written before. I like the idea of Scala. It sounds like C#, only with some of the niggles gone. I enjoy writing static types. It means you don’t have to writing tests so much. When you say it doesn’t have some of the niggles? C# doesn’t allow the use of a property as a method group. It doesn’t have Scala case classes, or sum types, where you can do a switch statement and the compiler checks that you’ve checked all the cases, which is really useful in functional-style programming. Pattern-matching, in other words. That’s actually the major niggle. C# is pretty good, and I’m quite happy with C#. And what about going even further with the type system to remove the need for tests to something like Haskell? Or is that a step too far? I’m quite a pragmatist, I don’t think I could deal with trying to write big systems in languages with too few other users, especially when learning how to structure things. I just don’t know anyone who can teach me, and the Internet won’t teach me. That’s the main reason I wouldn’t use it. If I turned up at a company that writes big systems in Haskell, I would have no objection to that, but I wouldn’t instigate it. What about things in C#? For instance, there’s contracts in C#, so you can try to statically verify a bit more about your code. Do you think that’s useful, or just not worthwhile? I’ve not really tried it. My hunch is that it needs to be built into the language and be quite mathematical for it to work in real life, and that doesn’t seem to have ended up true for C# contracts. I don’t think anyone who’s tried them thinks they’re any good. I might be wrong. On a slightly different note, how do you like to debug code? I think I’m quite an odd debugger. I use guesswork extremely rarely, especially if something seems quite difficult to debug. I’ve been bitten spending hours and hours on guesswork and not being scientific about debugging in the past, so now I’m scientific to a fault. What I want is to see the bug happening in the debugger, to step through the bug happening. To watch the program going from a valid state to an invalid state. When there’s a bug and I can’t work out why it’s happening, I try to find some piece of evidence which places the bug in one section of the code. From that experiment, I binary chop on the possible causes of the bug. I suppose that means binary chopping on places in the code, or binary chopping on a stage through a processing cycle. Basically, I’m very stupid about how I debug. I won’t make any guesses, I won’t use any intuition, I will only identify the experiment that’s going to binary chop most effectively and repeat rather than trying to guess anything. I suppose it’s quite top-down. Is most of the time then spent in the debugger? Absolutely, if at all possible I will never debug using print statements or logs. I don’t really hold much stock in outputting logs. If there’s any bug which can be reproduced locally, I’d rather do it in the debugger than outputting logs. And with SmartAssembly error reporting, there’s not a lot that can’t be either observed in an error report and just fixed, or reproduced locally. And in those other situations, maybe I’ll use logs. But I hate using logs. You stare at the log, trying to guess what’s going on, and that’s exactly what I don’t like doing. You have to just look at it and see does this look right or wrong. We’ve covered how you get to grip with bugs. How do you get to grips with an entire codebase? I watch it in the debugger. I find little bugs and then try to fix them, and mostly do it by watching them in the debugger and gradually getting an understanding of how the code works using my process of binary chopping. I have to do a lot of reading and watching code to choose where my slicing-in-half experiment is going to be. The last time I did it was SmartAssembly. The old code was a complete mess, but at least it did things top to bottom. There wasn’t too much of some of the big abstractions where flow of control goes all over the place, into a base class and back again. Code’s really hard to understand when that happens. So I like to choose a little bug and try to fix it, and choose a bigger bug and try to fix it. Definitely learn by doing. I want to always have an aim so that I get a little achievement after every few hours of debugging. Once I’ve learnt the codebase I might be able to fix all the bugs in an hour, but I’d rather be using them as an aim while I’m learning the codebase. If I was a maintainer of a codebase, what should I do to make it as easy as possible for you to understand? Keep distinct concepts in different places. And name your stuff so that it’s obvious which concepts live there. You shouldn’t have some variable that gets set miles up the top of somewhere, and then is read miles down to choose some later behaviour. I’m talking from a very much SmartAssembly point of view because the old SmartAssembly codebase had tons and tons of these things, where it would read some property of the code and then deal with it later. Just thousands of variables in scope. Loads of things to think about. If you can keep concepts separate, then it aids me in my process of fixing bugs one at a time, because each bug is going to more or less be understandable in the one place where it is. And what about tests? Do you think they help at all? I’ve never had the opportunity to learn a codebase which has had tests, I don’t know what it’s like! What about when you’re actually developing? How useful do you find tests in finding bugs or regressions? Finding regressions, absolutely. Running bits of code that would be quite hard to run otherwise, definitely. It doesn’t happen very often that a test finds a bug in the first place. I don’t really buy nebulous promises like tests being a good way to think about the spec of the code. My thinking goes something like “This code works at the moment, great, ship it! Ah, there’s a way that this code doesn’t work. Okay, write a test, demonstrate that it doesn’t work, fix it, use the test to demonstrate that it’s now fixed, and keep the test for future regressions.” The most valuable tests are for bugs that have actually happened at some point, because bugs that have actually happened at some point, despite the fact that you think you’ve fixed them, are way more likely to appear again than new bugs are. Does that mean that when you write your code the first time, there are no tests? Often. The chance of there being a bug in a new feature is relatively unaffected by whether I’ve written a test for that new feature because I’m not good enough at writing tests to think of bugs that I would have written into the code. So not writing regression tests for all of your code hasn’t affected you too badly? There are different kinds of features. Some of them just always work, and are just not flaky, they just continue working whatever you throw at them. Maybe because the type-checker is particularly effective around them. Writing tests for those features which just tend to always work is a waste of time. And because it’s a waste of time I’ll tend to wait until a feature has demonstrated its flakiness by having bugs in it before I start trying to test it. You can get a feel for whether it’s going to be flaky code as you’re writing it. I try to write it to make it not flaky, but there are some things that are just inherently flaky. And very occasionally, I’ll think “this is going to be flaky” as I’m writing, and then maybe do a test, but not most of the time. How do you think your programming style has changed over time? I’ve got clearer about what the right way of doing things is. I used to flip-flop a lot between different ideas. Five years ago I came up with some really good ideas and some really terrible ideas. All of them seemed great when I thought of them, but they were quite diverse ideas, whereas now I have a smaller set of reliable ideas that are actually good for structuring code. So my code is probably more similar to itself than it used to be back in the day, when I was trying stuff out. I’ve got more disciplined about encapsulation, I think. There are operational things like I use actors more now than I used to, and that forces me to use immutability more than I used to. The first code that I wrote in Red Gate was the memory profiler UI, and that was an actor, I just didn’t know the name of it at the time. I don’t really use object-orientation. By object-orientation, I mean having n objects of the same type which are mutable. I want a constant number of objects that are mutable, and they should be different types. I stick stuff in dictionaries and then have one thing that owns the dictionary and puts stuff in and out of it. That’s definitely a pattern that I’ve seen recently. I think maybe I’m doing functional programming. Possibly. It’s plausible. If you had to summarise the essence of programming in a pithy sentence, how would you do it? Programming is the form of art that, without losing any of the beauty of architecture or fine art, allows you to produce things that people love and you make money from. So you think it’s an art rather than a science? It’s a little bit of engineering, a smidgeon of maths, but it’s not science. Like architecture, programming is on that boundary between art and engineering. If you want to do it really nicely, it’s mostly art. You can get away with doing architecture and programming entirely by having a good engineering mind, but you’re not going to produce anything nice. You’re not going to have joy doing it if you’re an engineering mind. Architects who are just engineering minds are not going to enjoy their job. I suppose engineering is the foundation on which you build the art. Exactly. How do you think programming is going to change over the next ten years? There will be an unfortunate shift towards dynamically-typed languages, because of JavaScript. JavaScript has an unfair advantage. JavaScript’s unfair advantage will cause more people to be exposed to dynamically-typed languages, which means other dynamically-typed languages crop up and the best features go into dynamically-typed languages. Then people conflate the good features with the fact that it’s dynamically-typed, and more investment goes into dynamically-typed languages. They end up better, so people use them. What about the idea of compiling other languages, possibly statically-typed, to JavaScript? It’s a reasonable idea. I would like to do it, but I don’t think enough people in the world are going to do it to make it pick up. The hordes of beginners are the lifeblood of a language community. They are what makes there be good tools and what makes there be vibrant community websites. And any particular thing which is the same as JavaScript only with extra stuff added to it, although it might be technically great, is not going to have the hordes of beginners. JavaScript is always to be quickest and easiest way for a beginner to start programming in the browser. And dynamically-typed languages are great for beginners. Compilers are pretty scary and beginners don’t write big code. And having your errors come up in the same place, whether they’re statically checkable errors or not, is quite nice for a beginner. If someone asked me to teach them some programming, I’d teach them JavaScript. If dynamically-typed languages are great for beginners, when do you think the benefits of static typing start to kick in? The value of having a statically typed program is in the tools that rely on the static types to produce a smooth IDE experience rather than actually telling me my compile errors. And only once you’re experienced enough a programmer that having a really smooth IDE experience makes a blind bit of difference, does static typing make a blind bit of difference. So it’s not really about size of codebase. If I go and write up a tiny program, I’m still going to get value out of writing it in C# using ReSharper because I’m experienced with C# and ReSharper enough to be able to write code five times faster if I have that help. Any other visions of the future? Nobody’s going to use actors. Because everyone’s going to be running on single-core VMs connected over network-ready protocols like JSON over HTTP. So, parallelism within one operating system is going to die. But until then, you should use actors. More Red Gater Coder interviews

Read the article
Why should you choose Oracle WebLogic 12c instead of JBoss EAP 6?

- by Ricardo Ferreira

In this post, I will cover some technical differences between Oracle WebLogic 12c and JBoss EAP 6, which was released a couple days ago from Red Hat. This article claims to help you in the evaluation of key points that you should consider when choosing for an Java EE application server. In the following sections, I will present to you some important aspects that most customers ask us when they are seriously evaluating for an middleware infrastructure, specially if you are considering JBoss for some reason. I would suggest that you keep the following question in mind while you are reading the points: "Why should I choose JBoss instead of WebLogic?" 1) Multi Datacenter Deployment and Clustering - D/R ("Disaster & Recovery") architecture support is embedded on the WebLogic Server 12c product. JBoss EAP 6 on the other hand has no direct D/R support included, Red Hat relies on third-part tools with higher prices. When you consider a middleware solution to host your business critical application, you should worry with every architectural aspect that are related with the solution. Fail-over support is one little aspect of a truly reliable solution. If you do not worry about D/R, your solution will not be reliable. Having said that, with Red Hat and JBoss EAP 6, you have this extra cost that will increase considerably the total cost of ownership of the solution. As we commonly hear from analysts, open-source are not so cheaper when you start seeing the big picture. - WebLogic Server 12c supports advanced LAN clustering, detection of death servers and have a common alert framework. JBoss EAP 6 on the other hand has limited LAN clustering support with no server death detection. They do not generate any alerts when servers goes down (only if you buy JBoss ON which is a separated technology, but until now does not support JBoss EAP 6) and manual intervention are required when servers goes down. In most cases, admin people must rely on "kill -9", "tail -f someFile.log" and "ps ax | grep java" commands to manage failures and clustering anomalies. - WebLogic Server 12c supports the concept of Node Manager, which is a separated process that runs on the physical | virtual servers that allows extend the administration of the cluster to WebLogic managed servers that are often distributed across multiple machines and geographic locations. JBoss EAP 6 on the other hand has no equivalent technology. Whole server instances must be managed individually. - WebLogic Server 12c Node Manager supports Coherence to boost performance when managing servers. JBoss EAP 6 on the other hand has no similar technology. There is no way to coordinate JBoss and infiniband instances provided by JBoss using high throughput and low latency protocols like InfiniBand. The Node Manager feature also allows another very important feature that JBoss EAP lacks: secure the administration. When using WebLogic Node Manager, all the administration tasks are sent to the managed servers in a secure tunel protected by a certificate, which means that the transport layer that separates the WebLogic administration console from the managed servers are secured by SSL. - WebLogic Server 12c are now integrated with OTD ("Oracle Traffic Director") which is a web server technology derived from the former Sun iPlanet Web Server. This software complements the web server support offered by OHS ("Oracle HTTP Server"). Using OTD, WebLogic instances are load-balanced by a high powerful software that knows how to handle SDP ("Socket Direct Protocol") over InfiniBand, which boost performance when used with engineered systems technologies like Oracle Exalogic Elastic Cloud. JBoss EAP 6 on the other hand only offers support to Apache Web Server with custom modules created to deal with JBoss clusters, but only across standard TCP/IP networks. 2) Application and Runtime Diagnostics - WebLogic Server 12c have diagnostics capabilities embedded on the server called WLDF ("WebLogic Diagnostic Framework") so there is no need to rely on third-part tools. JBoss EAP 6 on the other hand has no diagnostics capabilities. Their only diagnostics tool is the log generated by the application server. Admin people are encouraged to analyse thousands of log lines to find out what is going on. - WebLogic Server 12c complement WLDF with JRockit MC ("Mission Control"), which provides to administrators and developers a complete insight about the JVM performance, behavior and possible bottlenecks. WebLogic Server 12c also have an classloader analysis tool embedded, and even a log analyzer tool that enables administrators and developers to view logs of multiple servers at the same time. JBoss EAP 6 on the other hand relies on third-part tools to do something similar. Again, only log searching are offered to find out whats going on. - WebLogic Server 12c offers end-to-end traceability and monitoring available through Oracle EM ("Enterprise Manager"), including monitoring of business transactions that flows through web servers, ESBs, application servers and database servers, all of this with high deep JVM analysis and diagnostics. JBoss EAP 6 on the other hand, even using JBoss ON ("Operations Network"), which is a separated technology, does not support those features. Red Hat relies on third-part tools to provide direct Oracle database traceability across JVMs. One of those tools are Oracle EM for non-Oracle middleware that manage JBoss, Tomcat, Websphere and IIS transparently. - WebLogic Server 12c with their JRockit support offers a tool called JRockit Flight Recorder, which can give developers a complete visibility of a certain period of application production monitoring with zero extra overhead. This automatic recording allows you to deep analyse threads latency, memory leaks, thread contention, resource utilization, stack overflow damages and GC ("Garbage Collection") cycles, to observe in real time stop-the-world phenomenons, generational, reference count and parallel collects and mutator threads analysis. JBoss EAP 6 don't even dream to support something similar, even because they don't have their own JVM. 3) Application Server Administration - WebLogic Server 12c offers a complete administration console complemented with scripting and macro-like recording capabilities. A single WebLogic console can managed up to hundreds of WebLogic servers belonging to the same domain. JBoss EAP 6 on the other hand has a limited console and provides a XML centric administration. JBoss, after ten years, started the development of a rudimentary centralized administration that still leave a lot of administration tasks aside, so admin people and developers must touch scripts and XML configuration files for most advanced and even simple administration tasks. This lead applications to error prone and risky deployments. Even using JBoss ON, JBoss EAP are not able to offer decent administration features for admin people which must be high skilled in JBoss internal architecture and its managing capabilities. - Oracle EM is available to manage multiple domains, databases, application servers, operating systems and virtualization, with a complete end-to-end visibility. JBoss ON does not provide management capabilities across the complete architecture, only basic monitoring. Even deployment must be done aside JBoss ON which does no integrate well with others softwares than JBoss. Until now, JBoss ON does not supports JBoss EAP 6, so even their minimal support for JBoss are not available for JBoss EAP 6 leaving customers uncovered and subject to high skilled JBoss admin people. - WebLogic Server 12c has the same administration model whatever is the topology selected by the customer. JBoss EAP 6 on the other hand differentiates between two operational models: standalone-mode and domain-mode, that are not consistent with each other. Depending on the mode used, the administration skill is different. - WebLogic Server 12c has no point-of-failures processes, and it does not need to define any specialized server. Domain model in WebLogic is available for years (at least ten years or more) and is production proven. JBoss EAP 6 on the other hand needs special processes to garantee JBoss integrity, the PC ("Process-Controller") and the HC ("Host-Controller"). Different from WebLogic, the domain model in JBoss is quite new (one year at tops) of maturity, and need to mature considerably until start doing things like WebLogic domain model does. - WebLogic Server 12c supports parallel deployment model which enables some artifacts being deployed at the same time. JBoss EAP 6 on the other hand does not have any similar feature. Every deployment are done atomically in the containers. This means that if you have a huge EAR (an EAR of 120 MB of size for instance) and deploy onto JBoss EAP 6, this EAR will take some minutes in order to starting accept thread requests. The same EAR deployed onto WebLogic Server 12c will reduce the deployment time at least in 2X compared to JBoss. 4) Support and Upgrades - WebLogic Server 12c has patch management available. JBoss EAP 6 on the other hand has no patch management available, each JBoss EAP instance should be patched manually. To achieve such feature, you need to buy a separated technology called JBoss ON ("Operations Network") that manage this type of stuff. But until now, JBoss ON does not support JBoss EAP 6 so, in practice, JBoss EAP 6 does not have this feature. - WebLogic Server 12c supports previuous WebLogic domains without any reconfiguration since its kernel is robust and mature since its creation in 1995. JBoss EAP 6 on the other hand has a proven lack of supportability between JBoss AS 4, 5, 6 and 7. Different kernels and messaging engines were implemented in JBoss stack in the last five years reveling their incapacity to create a well architected and proven middleware technology. - WebLogic Server 12c has patch prescription based on customer configuration. JBoss EAP 6 on the other hand has no such capability. People need to create ticket supports and have their installations revised by Red Hat support guys to gain some patch prescription from them. - Oracle WebLogic Server independent of the version has 8 years of support of new patches and has lifetime release of existing patches beyond that. JBoss EAP 6 on the other hand provides patches for a specific application server version up to 5 years after the release date. JBoss EAP 4 and previous versions had only 4 years. A good question that Red Hat will argue to answer is: "what happens when you find issues after year 5"? 5) RAC ("Real Application Clusters") Support - WebLogic Server 12c ships with a specific JDBC driver to leverage Oracle RAC clustering capabilities (Fast-Application-Notification, Transaction Affinity, Fast-Connection-Failover, etc). Oracle JDBC thin driver are also available. JBoss EAP 6 on the other hand ships only the standard Oracle JDBC thin driver. Load balancing with Oracle RAC are not supported. Manual intervention in case of planned or unplanned RAC downtime are necessary. In JBoss EAP 6, situation does not reestablish automatically after downtime. - WebLogic Server 12c has a feature called Active GridLink for Oracle RAC which provides up to 3X performance on OLTP applications. This seamless integration between WebLogic and Oracle database enable more value added to critical business applications leveraging their investments in Oracle database technology and Oracle middleware. JBoss EAP 6 on the other hand has no performance gains at all, even when admin people implement some kind of connection-pooling tuning. - WebLogic Server 12c also supports transaction and web session affinity to the Oracle RAC, which provides aditional gains of performance. This is particularly interesting if you are creating a reliable solution that are distributed not only in an LAN cluster, but into a different data center. JBoss EAP 6 on the other hand has no such support. 6) Standards and Technology Support - WebLogic Server 12c is fully Java EE 6 compatible and production ready since december of 2011. JBoss EAP 6 on the other hand became fully compatible with Java EE 6 only in the community version after three months, and production ready only in a few days considering that this article was written in June of 2012. Red Hat says that they are the masters of innovation and technology proliferation, but compared with Oracle and even other proprietary vendors like IBM, they historically speaking are lazy to deliver the most newest technologies and standards adherence. - Oracle is the steward of Java, driving innovation into the platform from commercial and open-source vendors. Red Hat on the other hand does not have its own JVM and relies on third-part JVMs to complete their application server offer. 95% of Red Hat customers are using Oracle HotSpot as JVM, which means that without Oracle involvement, their support are limited exclusively to the application server layer and we all know that most problems are happens in the JVM layer. - WebLogic Server 12c supports natively JDK 7, which empower developers to explore the maximum of the Java platform productivity when writing code. This feature differentiate WebLogic from others application servers (except GlassFish that are also managed by Oracle) because the usage of JDK 7 introduce such remarkable productivity features like the "try-with-resources" enhancement, catching multiple exceptions with one try block, Strings in the switch statements, JVM improvements in terms of JDBC, I/O, networking, security, concurrency and of course, the most important feature of Java 7: native support for multiple non-Java languages. More features regarding JDK 7 can be found here. JBoss EAP 6 on the other hand does not support JDK 7 officially, they comment in their community version that "Java SE 7 can be used with JBoss 7" which does not gives you any guarantees of enterprise support for JDK 7. - Oracle WebLogic Server 12c supports integration with Spring framework allowing Spring applications to use WebLogic special transaction manager, exposing bean interfaces to WebLogic MBeans to take advantage of all WebLogic monitoring and administration advantages. JBoss EAP 6 on the other hand has no special integration with Spring. In fact, Red Hat offers a suspicious package called "JBoss Web Platform" that in theory supports Spring, but in practice this package does not offers any special integration. It is just a facility for Red Hat customers to have support from both JBoss and Spring technology using the same customer support. 7) Lightweight Development - Oracle WebLogic Server 12c and Oracle GlassFish are completely integrated and can share applications without any modifications. Starting with the 12c version, WebLogic now understands natively GlassFish deployment descriptors and specific configurations in order to offer you a truly and reliable migration path from a community Java EE application server to a enterprise middleware product like WebLogic. JBoss EAP 6 on the other hand has no support to natively reuse an existing (or still in development) application from JBoss AS community server. Users of JBoss suffer of critical issues during deployment time that includes: changing the libraries and dependencies of the application, patching the DTD or XSD deployment descriptors, refactoring of the application layers due classloading issues and anomalies, rebuilding of persistence, business and web layers due issues with "usage of the certified version of an certain dependency" or "frameworks that Red Hat potentially does not recommend" etc. If you have the culture or enterprise IT directive of developing Java EE applications using community middleware to in a certain future, transition to enterprise (supported by a vendor) middleware, Oracle WebLogic plus Oracle GlassFish offers you a more sustainable solution. - WebLogic Server 12c has a very light ZIP distribution (less than 165 MB). JBoss EAP 6 ZIP size is around 130 MB, together with JBoss ON you have more 100 MB resulting in a higher download footprint. This is particularly interesting if you plan to use automated setup of application server instances (for example, to rapidly setup a development or staging environment) using Maven or Hudson. - WebLogic Server 12c has a complete integration with Maven allowing developers to setup WebLogic domains with few commands. Tasks like downloading WebLogic, installation, domain creation, data sources deployment are completely integrated. JBoss EAP 6 on the other hand has a limited offer integration with those tools. - WebLogic Server 12c has a startup mode called WLX that turns-off EJB, JMS and JCA containers leaving enabled only the web container with Java EE 6 web profile. JBoss EAP 6 on the other hand has no such feature, you need to disable manually the containers that you do not want to use. - WebLogic Server 12c supports fastswap, which enables you to change classes without redeployment. This is particularly interesting if you are developing patches for the application that is already deployed and you do not want to redeploy the entire application. This is the same behavior that most application servers offers to JSP pages, but with WebLogic Server 12c, you have the same feature for Java classes in general. JBoss EAP 6 on the other hand has no such support. Even JBoss EAP 5 does not support this until now. 8) JMS and Messaging - WebLogic Server 12c has a proven and high scalable JMS implementation since its initial release in 1995. JBoss EAP 6 on the other hand has a still immature technology called HornetQ, which was introduced in JBoss EAP 5 replacing everything that was implemented in the previous versions. Red Hat loves to introduce new technologies across JBoss versions, playing around with customers and their investments. And when they are asked about why they have changed the implementation and caused such a mess, their answer is always: "the previous implementation was inadequate and not aligned with the community strategy so we are creating a new a improved one". This Red Hat practice leads to uncomfortable investments that in a near future (sometimes less than a year) will be affected in someway. - WebLogic Server 12c has troubleshooting and monitoring features included on the WebLogic console and WLDF. JBoss EAP 6 on the other hand has no direct monitoring on the console, activity is reflected only on the logs, no debug logs available in case of JMS issues. - WebLogic Server 12c has extremely good performance and scalability. JBoss EAP 6 on the other hand has a JMS storage mechanism relying on Oracle database or MySQL. This means that if an issue in production happens and Red Hat affirms that an performance issue is happening due to database problems, they will not support you on the performance issue. They will orient you to call Oracle instead. - WebLogic Server 12c supports messaging enterprise features like SAF ("Store and Forward"), Distributed Queues/Topics and Foreign JMS providers support that leverage JMS implementations without compromise developer code making things completely transparent. JBoss EAP 6 on the other hand do not even dream to support such features. 9) Caching and Grid - Coherence, which is the leading and most mature data grid technology from Oracle, is available since early 2000 and was integrated with WebLogic in 2009. Coherence and WebLogic clusters can be both managed from WebLogic administrative console. Even Node Manager supports Coherence. JBoss on the other hand discontinued JBoss Cache, which was their caching implementation just like they did with the messaging implementation (JBossMQ) which was a issue for long term customers. JBoss EAP 6 ships InfiniSpan version 1.0 which is immature and lack a proven record of successful cases and reliability. - WebLogic Server 12c has a feature called ActiveCache which uses Coherence to, without any code changes, replicate HTTP sessions from both WebLogic and other application servers like JBoss, Tomcat, Websphere, GlassFish and even Microsoft IIS. JBoss EAP 6 on the other hand does have such support and even when they do in the future, they probably will support only their own application server. - Coherence can be used to manage both L1 and L2 cache levels, providing support to Oracle TopLink and others JPA compliant implementations, even Hibernate. JBoss EAP 6 and Infinispan on the other hand supports only Hibernate. And most important of all: Infinispan does not have any successful case of L1 or L2 caching level support using Hibernate, which lead us to reflect about its viability. 10) Performance - WebLogic Server 12c is certified with Oracle Exalogic Elastic Cloud and can run unchanged applications at this engineered system. This approach can benefit customers from Exalogic optimization's of both kernel and JVM layers to boost performance in terms of 10X for web, OLTP, JMS and grid applications. JBoss EAP 6 on the other hand has no investment on engineered systems: customers do not have the choice to deploy on a Java ultra fast system if their project becomes relevant and performance issues are detected. - WebLogic Server 12c maintains a performance gain across each new release: starting on WebLogic 5.1, the overall performance gain has been close to 4X, which close to a 20% gain release by release. JBoss on the other hand does not provide SPECJAppServer or SPECJEnterprise performance benchmarks. Their so called "performance gains" remains hidden in their customer environments, which lead us to think if it is true or not since we will never get access to those environments. - WebLogic Server 12c has industry performance benchmarks with submissions across platforms and configurations leading SPECJ. Oracle WebLogic leads SPECJAppServer performance in multiple categories, fitting all customer topologies like: dual-node, single-node, multi-node and multi-node with RAC. JBoss... again, does not provide any SPECJAppServer performance benchmarks. - WebLogic Server 12c has a feature called work manager which allows your application to embrace new performance levels based on critical resource utilization of the CPUs usage. Work managers prioritizes work and allocates threads based on an execution model that takes into account administrator-defined parameters and actual run-time performance and throughput. JBoss EAP 6 on the other hand has no compared feature and probably they never will. Not supporting such feature like work managers, JBoss EAP 6 forces admin people and specially developers to uncover performance gains in a intrusive way, rewriting the code and doing performance refactorings. 11) Professional Services Support - WebLogic Server 12c and any other technology sold by Oracle give customers the possibility of hire OCS ("Oracle Consulting Services") to manage critical scenarios, deployment assistance of new applications, high skilled consultancy of architecture, best practices and people allocation together with customer teams. All OCS services are available without any restrictions, having the customer bought software from Oracle or just starting their implementation before any acquisition. JBoss EAP 6 or Red Hat to be more specifically, only offers professional services if you buy subscriptions from them. If you are developing a new critical application for your business and need the help of Red Hat for a serious issue or architecture decision, they will probably say: "OK... I can help you but after you buy subscriptions from me". Red Hat also does not allows their professional services consultants to manage environments that uses community based software. They will probably force you to first buy a subscription, download their "enterprise" version and them, optionally hire their consultants. - Oracle provides you our university to educate your team into our technologies, including of course specialized trainings of WebLogic application server. At any time and location, you can hire Oracle to train your team so you get trustful knowledge according to your specific needs. Certifications for the products are also available if your technical people desire to differentiate themselves as professionals. Red Hat on the other hand have a limited pool of resources to train your team in their technologies. Basically they are selling training and certification for RHEL ("Red Hat Enterprise Linux") but if you demand more specialized training in JBoss middleware, they will probably connect you to some "certified" partner localized training since they are apparently discontinuing their education center, at least here in Brazil. They were not able to reproduce their success with RHEL education to their middleware division since they need first sell the subscriptions to after gives you specialized training. And again, they only offer you specialized training based on their enterprise version (EAP in the case of JBoss) which means that the courses will be a quite outdated. There are reports of developers that took official training's from Red Hat at this year (2012) and in a certain JBoss advanced course, Red Hat supposedly covered JBossMQ as the messaging subsystem, and even the printed material provided was based on JBossMQ since the training was created for JBoss EAP 4.3. 12) Encouraging Transparency without Ulterior Motives - WebLogic Server 12c like any other software from Oracle can be downloaded any time from anywhere, you should only possess an OTN ("Oracle Technology Network") credential and you can download any enterprise software how many times you want. And is not some kind of "trial" version. It is the official binaries that will be running for ever in your data center. Oracle does not encourages the usage of "specific versions" of our software. The binaries you buy from Oracle are the same binaries anyone in the world could download and use for testing and personal education. JBoss EAP 6 on the other hand are not available for download unless you buy a subscription and get access to the Red Hat enterprise repositories. If you need to test, learn or just start creating your application using Red Hat's middleware software, you should download it from the community website. You are not allowed to download the enterprise version that, according to Red Hat are more secure, reliable and robust. But no one of us want to start the development of a software with an unsecured, unreliable and not scalable middleware right? So what you do? You are "invited" by Red Hat to buy subscriptions from them to get access to the "cool" version of the software. - WebLogic Server 12c prices are publicly available in the Oracle website. If you want to know right now how much WebLogic will cost to your organization, just click here and get access to our price list. In the case of WebLogic, check out the "US Oracle Technology Commercial Price List". Oracle also encourages you to get in touch with a sales representative to discuss discounts that would make possible the investment into our technology. But you are not required to do this, only if you are interested in buying our technology or maybe you want to discuss some discount scenarios. JBoss EAP 6 on the other hand does not have its cost publicly available in Red Hat's website or in any other media, at least is not so easy to get such information. The only link you will possibly find in their website is a "Contact a Sales Representative" link. This is not a very good relationship between an customer and an vendor. This is not an example of transparency, mainly when the software are sold as open. In this situations, customers expects to see the software prices publicly available, so they can have the chance to decide, based on the existing features of the software, if the cost is fair or not. Conclusion Oracle WebLogic is the most mature, secure, reliable and scalable Java EE application server of the market, and have a proven record of success around the globe to prove it's majority. Don't lose the chance to discover today how WebLogic could fit your needs and sustain your global IT middleware strategy, no matter if your strategy are completely based on the Cloud or not.

Read the article
256 Windows Azure Worker Roles, Windows Kinect and a 90's Text-Based Ray-Tracer

- by Alan Smith

For a couple of years I have been demoing a simple render farm hosted in Windows Azure using worker roles and the Azure Storage service. At the start of the presentation I deploy an Azure application that uses 16 worker roles to render a 1,500 frame 3D ray-traced animation. At the end of the presentation, when the animation was complete, I would play the animation delete the Azure deployment. The standing joke with the audience was that it was that it was a “$2 demo”, as the compute charges for running the 16 instances for an hour was $1.92, factor in the bandwidth charges and it’s a couple of dollars. The point of the demo is that it highlights one of the great benefits of cloud computing, you pay for what you use, and if you need massive compute power for a short period of time using Windows Azure can work out very cost effective. The “$2 demo” was great for presenting at user groups and conferences in that it could be deployed to Azure, used to render an animation, and then removed in a one hour session. I have always had the idea of doing something a bit more impressive with the demo, and scaling it from a “$2 demo” to a “$30 demo”. The challenge was to create a visually appealing animation in high definition format and keep the demo time down to one hour. This article will take a run through how I achieved this. Ray Tracing Ray tracing, a technique for generating high quality photorealistic images, gained popularity in the 90’s with companies like Pixar creating feature length computer animations, and also the emergence of shareware text-based ray tracers that could run on a home PC. In order to render a ray traced image, the ray of light that would pass from the view point must be tracked until it intersects with an object. At the intersection, the color, reflectiveness, transparency, and refractive index of the object are used to calculate if the ray will be reflected or refracted. Each pixel may require thousands of calculations to determine what color it will be in the rendered image. Pin-Board Toys Having very little artistic talent and a basic understanding of maths I decided to focus on an animation that could be modeled fairly easily and would look visually impressive. I’ve always liked the pin-board desktop toys that become popular in the 80’s and when I was working as a 3D animator back in the 90’s I always had the idea of creating a 3D ray-traced animation of a pin-board, but never found the energy to do it. Even if I had a go at it, the render time to produce an animation that would look respectable on a 486 would have been measured in months. PolyRay Back in 1995 I landed my first real job, after spending three years being a beach-ski-climbing-paragliding-bum, and was employed to create 3D ray-traced animations for a CD-ROM that school kids would use to learn physics. I had got into the strange and wonderful world of text-based ray tracing, and was using a shareware ray-tracer called PolyRay. PolyRay takes a text file describing a scene as input and, after a few hours processing on a 486, produced a high quality ray-traced image. The following is an example of a basic PolyRay scene file. background Midnight_Blue static define matte surface { ambient 0.1 diffuse 0.7 } define matte_white texture { matte { color white } } define matte_black texture { matte { color dark_slate_gray } } define position_cylindrical 3 define lookup_sawtooth 1 define light_wood <0.6, 0.24, 0.1> define median_wood <0.3, 0.12, 0.03> define dark_wood <0.05, 0.01, 0.005> define wooden texture { noise surface { ambient 0.2 diffuse 0.7 specular white, 0.5 microfacet Reitz 10 position_fn position_cylindrical position_scale 1 lookup_fn lookup_sawtooth octaves 1 turbulence 1 color_map( [0.0, 0.2, light_wood, light_wood] [0.2, 0.3, light_wood, median_wood] [0.3, 0.4, median_wood, light_wood] [0.4, 0.7, light_wood, light_wood] [0.7, 0.8, light_wood, median_wood] [0.8, 0.9, median_wood, light_wood] [0.9, 1.0, light_wood, dark_wood]) } } define glass texture { surface { ambient 0 diffuse 0 specular 0.2 reflection white, 0.1 transmission white, 1, 1.5 }} define shiny surface { ambient 0.1 diffuse 0.6 specular white, 0.6 microfacet Phong 7 } define steely_blue texture { shiny { color black } } define chrome texture { surface { color white ambient 0.0 diffuse 0.2 specular 0.4 microfacet Phong 10 reflection 0.8 } } viewpoint { from <4.000, -1.000, 1.000> at <0.000, 0.000, 0.000> up <0, 1, 0> angle 60 resolution 640, 480 aspect 1.6 image_format 0 } light <-10, 30, 20> light <-10, 30, -20> object { disc <0, -2, 0>, <0, 1, 0>, 30 wooden } object { sphere <0.000, 0.000, 0.000>, 1.00 chrome } object { cylinder <0.000, 0.000, 0.000>, <0.000, 0.000, -4.000>, 0.50 chrome } After setting up the background and defining colors and textures, the viewpoint is specified. The “camera” is located at a point in 3D space, and it looks towards another point. The angle, image resolution, and aspect ratio are specified. Two lights are present in the image at defined coordinates. The three objects in the image are a wooden disc to represent a table top, and a sphere and cylinder that intersect to form a pin that will be used for the pin board toy in the final animation. When the image is rendered, the following image is produced. The pins are modeled with a chrome surface, so they reflect the environment around them. Note that the scale of the pin shaft is not correct, this will be fixed later. Modeling the Pin Board The frame of the pin-board is made up of three boxes, and six cylinders, the front box is modeled using a clear, slightly reflective solid, with the same refractive index of glass. The other shapes are modeled as metal. object { box <-5.5, -1.5, 1>, <5.5, 5.5, 1.2> glass } object { box <-5.5, -1.5, -0.04>, <5.5, 5.5, -0.09> steely_blue } object { box <-5.5, -1.5, -0.52>, <5.5, 5.5, -0.59> steely_blue } object { cylinder <-5.2, -1.2, 1.4>, <-5.2, -1.2, -0.74>, 0.2 steely_blue } object { cylinder <5.2, -1.2, 1.4>, <5.2, -1.2, -0.74>, 0.2 steely_blue } object { cylinder <-5.2, 5.2, 1.4>, <-5.2, 5.2, -0.74>, 0.2 steely_blue } object { cylinder <5.2, 5.2, 1.4>, <5.2, 5.2, -0.74>, 0.2 steely_blue } object { cylinder <0, -1.2, 1.4>, <0, -1.2, -0.74>, 0.2 steely_blue } object { cylinder <0, 5.2, 1.4>, <0, 5.2, -0.74>, 0.2 steely_blue } In order to create the matrix of pins that make up the pin board I used a basic console application with a few nested loops to create two intersecting matrixes of pins, which models the layout used in the pin boards. The resulting image is shown below. The pin board contains 11,481 pins, with the scene file containing 23,709 lines of code. For the complete animation 2,000 scene files will be created, which is over 47 million lines of code. Each pin in the pin-board will slide out a specific distance when an object is pressed into the back of the board. This is easily modeled by setting the Z coordinate of the pin to a specific value. In order to set all of the pins in the pin-board to the correct position, a bitmap image can be used. The position of the pin can be set based on the color of the pixel at the appropriate position in the image. When the Windows Azure logo is used to set the Z coordinate of the pins, the following image is generated. The challenge now was to make a cool animation. The Azure Logo is fine, but it is static. Using a normal video to animate the pins would not work; the colors in the video would not be the same as the depth of the objects from the camera. In order to simulate the pin board accurately a series of frames from a depth camera could be used. Windows Kinect The Kenect controllers for the X-Box 360 and Windows feature a depth camera. The Kinect SDK for Windows provides a programming interface for Kenect, providing easy access for .NET developers to the Kinect sensors. The Kinect Explorer provided with the Kinect SDK is a great starting point for exploring Kinect from a developers perspective. Both the X-Box 360 Kinect and the Windows Kinect will work with the Kinect SDK, the Windows Kinect is required for commercial applications, but the X-Box Kinect can be used for hobby projects. The Windows Kinect has the advantage of providing a mode to allow depth capture with objects closer to the camera, which makes for a more accurate depth image for setting the pin positions. Creating a Depth Field Animation The depth field animation used to set the positions of the pin in the pin board was created using a modified version of the Kinect Explorer sample application. In order to simulate the pin board accurately, a small section of the depth range from the depth sensor will be used. Any part of the object in front of the depth range will result in a white pixel; anything behind the depth range will be black. Within the depth range the pixels in the image will be set to RGB values from 0,0,0 to 255,255,255. A screen shot of the modified Kinect Explorer application is shown below. The Kinect Explorer sample application was modified to include slider controls that are used to set the depth range that forms the image from the depth stream. This allows the fine tuning of the depth image that is required for simulating the position of the pins in the pin board. The Kinect Explorer was also modified to record a series of images from the depth camera and save them as a sequence JPEG files that will be used to animate the pins in the animation the Start and Stop buttons are used to start and stop the image recording. En example of one of the depth images is shown below. Once a series of 2,000 depth images has been captured, the task of creating the animation can begin. Rendering a Test Frame In order to test the creation of frames and get an approximation of the time required to render each frame a test frame was rendered on-premise using PolyRay. The output of the rendering process is shown below. The test frame contained 23,629 primitive shapes, most of which are the spheres and cylinders that are used for the 11,800 or so pins in the pin board. The 1280x720 image contains 921,600 pixels, but as anti-aliasing was used the number of rays that were calculated was 4,235,777, with 3,478,754,073 object boundaries checked. The test frame of the pin board with the depth field image applied is shown below. The tracing time for the test frame was 4 minutes 27 seconds, which means rendering the2,000 frames in the animation would take over 148 hours, or a little over 6 days. Although this is much faster that an old 486, waiting almost a week to see the results of an animation would make it challenging for animators to create, view, and refine their animations. It would be much better if the animation could be rendered in less than one hour. Windows Azure Worker Roles The cost of creating an on-premise render farm to render animations increases in proportion to the number of servers. The table below shows the cost of servers for creating a render farm, assuming a cost of $500 per server. Number of Servers Cost 1 $500 16 $8,000 256 $128,000 As well as the cost of the servers, there would be additional costs for networking, racks etc. Hosting an environment of 256 servers on-premise would require a server room with cooling, and some pretty hefty power cabling. The Windows Azure compute services provide worker roles, which are ideal for performing processor intensive compute tasks. With the scalability available in Windows Azure a job that takes 256 hours to complete could be perfumed using different numbers of worker roles. The time and cost of using 1, 16 or 256 worker roles is shown below. Number of Worker Roles Render Time Cost 1 256 hours $30.72 16 16 hours $30.72 256 1 hour $30.72 Using worker roles in Windows Azure provides the same cost for the 256 hour job, irrespective of the number of worker roles used. Provided the compute task can be broken down into many small units, and the worker role compute power can be used effectively, it makes sense to scale the application so that the task is completed quickly, making the results available in a timely fashion. The task of rendering 2,000 frames in an animation is one that can easily be broken down into 2,000 individual pieces, which can be performed by a number of worker roles. Creating a Render Farm in Windows Azure The architecture of the render farm is shown in the following diagram. The render farm is a hybrid application with the following components: · On-Premise o Windows Kinect – Used combined with the Kinect Explorer to create a stream of depth images. o Animation Creator – This application uses the depth images from the Kinect sensor to create scene description files for PolyRay. These files are then uploaded to the jobs blob container, and job messages added to the jobs queue. o Process Monitor – This application queries the role instance lifecycle table and displays statistics about the render farm environment and render process. o Image Downloader – This application polls the image queue and downloads the rendered animation files once they are complete. · Windows Azure o Azure Storage – Queues and blobs are used for the scene description files and completed frames. A table is used to store the statistics about the rendering environment. The architecture of each worker role is shown below. The worker role is configured to use local storage, which provides file storage on the worker role instance that can be use by the applications to render the image and transform the format of the image. The service definition for the worker role with the local storage configuration highlighted is shown below. <?xml version="1.0" encoding="utf-8"?> <ServiceDefinition name="CloudRay" > <WorkerRole name="CloudRayWorkerRole" vmsize="Small"> <Imports> </Imports> <ConfigurationSettings> <Setting name="DataConnectionString" /> </ConfigurationSettings> <LocalResources> <LocalStorage name="RayFolder" cleanOnRoleRecycle="true" /> </LocalResources> </WorkerRole> </ServiceDefinition> The two executable programs, PolyRay.exe and DTA.exe are included in the Azure project, with Copy Always set as the property. PolyRay will take the scene description file and render it to a Truevision TGA file. As the TGA format has not seen much use since the mid 90’s it is converted to a JPG image using Dave's Targa Animator, another shareware application from the 90’s. Each worker roll will use the following process to render the animation frames. 1. The worker process polls the job queue, if a job is available the scene description file is downloaded from blob storage to local storage. 2. PolyRay.exe is started in a process with the appropriate command line arguments to render the image as a TGA file. 3. DTA.exe is started in a process with the appropriate command line arguments convert the TGA file to a JPG file. 4. The JPG file is uploaded from local storage to the images blob container. 5. A message is placed on the images queue to indicate a new image is available for download. 6. The job message is deleted from the job queue. 7. The role instance lifecycle table is updated with statistics on the number of frames rendered by the worker role instance, and the CPU time used. The code for this is shown below. public override void Run() { // Set environment variables string polyRayPath = Path.Combine(Environment.GetEnvironmentVariable("RoleRoot"), PolyRayLocation); string dtaPath = Path.Combine(Environment.GetEnvironmentVariable("RoleRoot"), DTALocation); LocalResource rayStorage = RoleEnvironment.GetLocalResource("RayFolder"); string localStorageRootPath = rayStorage.RootPath; JobQueue jobQueue = new JobQueue("renderjobs"); JobQueue downloadQueue = new JobQueue("renderimagedownloadjobs"); CloudRayBlob sceneBlob = new CloudRayBlob("scenes"); CloudRayBlob imageBlob = new CloudRayBlob("images"); RoleLifecycleDataSource roleLifecycleDataSource = new RoleLifecycleDataSource(); Frames = 0; while (true) { // Get the render job from the queue CloudQueueMessage jobMsg = jobQueue.Get(); if (jobMsg != null) { // Get the file details string sceneFile = jobMsg.AsString; string tgaFile = sceneFile.Replace(".pi", ".tga"); string jpgFile = sceneFile.Replace(".pi", ".jpg"); string sceneFilePath = Path.Combine(localStorageRootPath, sceneFile); string tgaFilePath = Path.Combine(localStorageRootPath, tgaFile); string jpgFilePath = Path.Combine(localStorageRootPath, jpgFile); // Copy the scene file to local storage sceneBlob.DownloadFile(sceneFilePath); // Run the ray tracer. string polyrayArguments = string.Format("\"{0}\" -o \"{1}\" -a 2", sceneFilePath, tgaFilePath); Process polyRayProcess = new Process(); polyRayProcess.StartInfo.FileName = Path.Combine(Environment.GetEnvironmentVariable("RoleRoot"), polyRayPath); polyRayProcess.StartInfo.Arguments = polyrayArguments; polyRayProcess.Start(); polyRayProcess.WaitForExit(); // Convert the image string dtaArguments = string.Format(" {0} /FJ /P{1}", tgaFilePath, Path.GetDirectoryName (jpgFilePath)); Process dtaProcess = new Process(); dtaProcess.StartInfo.FileName = Path.Combine(Environment.GetEnvironmentVariable("RoleRoot"), dtaPath); dtaProcess.StartInfo.Arguments = dtaArguments; dtaProcess.Start(); dtaProcess.WaitForExit(); // Upload the image to blob storage imageBlob.UploadFile(jpgFilePath); // Add a download job. downloadQueue.Add(jpgFile); // Delete the render job message jobQueue.Delete(jobMsg); Frames++; } else { Thread.Sleep(1000); } // Log the worker role activity. roleLifecycleDataSource.Alive ("CloudRayWorker", RoleLifecycleDataSource.RoleLifecycleId, Frames); } } Monitoring Worker Role Instance Lifecycle In order to get more accurate statistics about the lifecycle of the worker role instances used to render the animation data was tracked in an Azure storage table. The following class was used to track the worker role lifecycles in Azure storage. public class RoleLifecycle : TableServiceEntity { public string ServerName { get; set; } public string Status { get; set; } public DateTime StartTime { get; set; } public DateTime EndTime { get; set; } public long SecondsRunning { get; set; } public DateTime LastActiveTime { get; set; } public int Frames { get; set; } public string Comment { get; set; } public RoleLifecycle() { } public RoleLifecycle(string roleName) { PartitionKey = roleName; RowKey = Utils.GetAscendingRowKey(); Status = "Started"; StartTime = DateTime.UtcNow; LastActiveTime = StartTime; EndTime = StartTime; SecondsRunning = 0; Frames = 0; } } A new instance of this class is created and added to the storage table when the role starts. It is then updated each time the worker renders a frame to record the total number of frames rendered and the total processing time. These statistics are used be the monitoring application to determine the effectiveness of use of resources in the render farm. Rendering the Animation The Azure solution was deployed to Windows Azure with the service configuration set to 16 worker role instances. This allows for the application to be tested in the cloud environment, and the performance of the application determined. When I demo the application at conferences and user groups I often start with 16 instances, and then scale up the application to the full 256 instances. The configuration to run 16 instances is shown below. <?xml version="1.0" encoding="utf-8"?> <ServiceConfiguration serviceName="CloudRay" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration" osFamily="1" osVersion="*"> <Role name="CloudRayWorkerRole"> <Instances count="16" /> <ConfigurationSettings> <Setting name="DataConnectionString" value="DefaultEndpointsProtocol=https;AccountName=cloudraydata;AccountKey=..." /> </ConfigurationSettings> </Role> </ServiceConfiguration> About six minutes after deploying the application the first worker roles become active and start to render the first frames of the animation. The CloudRay Monitor application displays an icon for each worker role instance, with a number indicating the number of frames that the worker role has rendered. The statistics on the left show the number of active worker roles and statistics about the render process. The render time is the time since the first worker role became active; the CPU time is the total amount of processing time used by all worker role instances to render the frames. Five minutes after the first worker role became active the last of the 16 worker roles activated. By this time the first seven worker roles had each rendered one frame of the animation. With 16 worker roles u and running it can be seen that one hour and 45 minutes CPU time has been used to render 32 frames with a render time of just under 10 minutes. At this rate it would take over 10 hours to render the 2,000 frames of the full animation. In order to complete the animation in under an hour more processing power will be required. Scaling the render farm from 16 instances to 256 instances is easy using the new management portal. The slider is set to 256 instances, and the configuration saved. We do not need to re-deploy the application, and the 16 instances that are up and running will not be affected. Alternatively, the configuration file for the Azure service could be modified to specify 256 instances. <?xml version="1.0" encoding="utf-8"?> <ServiceConfiguration serviceName="CloudRay" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration" osFamily="1" osVersion="*"> <Role name="CloudRayWorkerRole"> <Instances count="256" /> <ConfigurationSettings> <Setting name="DataConnectionString" value="DefaultEndpointsProtocol=https;AccountName=cloudraydata;AccountKey=..." /> </ConfigurationSettings> </Role> </ServiceConfiguration> Six minutes after the new configuration has been applied 75 new worker roles have activated and are processing their first frames. Five minutes later the full configuration of 256 worker roles is up and running. We can see that the average rate of frame rendering has increased from 3 to 12 frames per minute, and that over 17 hours of CPU time has been utilized in 23 minutes. In this test the time to provision 140 worker roles was about 11 minutes, which works out at about one every five seconds. We are now half way through the rendering, with 1,000 frames complete. This has utilized just under three days of CPU time in a little over 35 minutes. The animation is now complete, with 2,000 frames rendered in a little over 52 minutes. The CPU time used by the 256 worker roles is 6 days, 7 hours and 22 minutes with an average frame rate of 38 frames per minute. The rendering of the last 1,000 frames took 16 minutes 27 seconds, which works out at a rendering rate of 60 frames per minute. The frame counts in the server instances indicate that the use of a queue to distribute the workload has been very effective in distributing the load across the 256 worker role instances. The first 16 instances that were deployed first have rendered between 11 and 13 frames each, whilst the 240 instances that were added when the application was scaled have rendered between 6 and 9 frames each. Completed Animation I’ve uploaded the completed animation to YouTube, a low resolution preview is shown below. Pin Board Animation Created using Windows Kinect and 256 Windows Azure Worker Roles The animation can be viewed in 1280x720 resolution at the following link: http://www.youtube.com/watch?v=n5jy6bvSxWc Effective Use of Resources According to the CloudRay monitor statistics the animation took 6 days, 7 hours and 22 minutes CPU to render, this works out at 152 hours of compute time, rounded up to the nearest hour. As the usage for the worker role instances are billed for the full hour, it may have been possible to render the animation using fewer than 256 worker roles. When deciding the optimal usage of resources, the time required to provision and start the worker roles must also be considered. In the demo I started with 16 worker roles, and then scaled the application to 256 worker roles. It would have been more optimal to start the application with maybe 200 worker roles, and utilized the full hour that I was being billed for. This would, however, have prevented showing the ease of scalability of the application. The new management portal displays the CPU usage across the worker roles in the deployment. The average CPU usage across all instances is 93.27%, with over 99% used when all the instances are up and running. This shows that the worker role resources are being used very effectively. Grid Computing Scenarios Although I am using this scenario for a hobby project, there are many scenarios where a large amount of compute power is required for a short period of time. Windows Azure provides a great platform for developing these types of grid computing applications, and can work out very cost effective. · Windows Azure can provide massive compute power, on demand, in a matter of minutes. · The use of queues to manage the load balancing of jobs between role instances is a simple and effective solution. · Using a cloud-computing platform like Windows Azure allows proof-of-concept scenarios to be tested and evaluated on a very low budget. · No charges for inbound data transfer makes the uploading of large data sets to Windows Azure Storage services cost effective. (Transaction charges still apply.) Tips for using Windows Azure for Grid Computing Scenarios I found the implementation of a render farm using Windows Azure a fairly simple scenario to implement. I was impressed by ease of scalability that Azure provides, and by the short time that the application took to scale from 16 to 256 worker role instances. In this case it was around 13 minutes, in other tests it took between 10 and 20 minutes. The following tips may be useful when implementing a grid computing project in Windows Azure. · Using an Azure Storage queue to load-balance the units of work across multiple worker roles is simple and very effective. The design I have used in this scenario could easily scale to many thousands of worker role instances. · Windows Azure accounts are typically limited to 20 cores. If you need to use more than this, a call to support and a credit card check will be required. · Be aware of how the billing model works. You will be charged for worker role instances for the full clock our in which the instance is deployed. Schedule the workload to start just after the clock hour has started. · Monitor the utilization of the resources you are provisioning, ensure that you are not paying for worker roles that are idle. · If you are deploying third party applications to worker roles, you may well run into licensing issues. Purchasing software licenses on a per-processor basis when using hundreds of processors for a short time period would not be cost effective. · Third party software may also require installation onto the worker roles, which can be accomplished using start-up tasks. Bear in mind that adding a startup task and possible re-boot will add to the time required for the worker role instance to start and activate. An alternative may be to use a prepared VM and use VM roles. · Consider using the Windows Azure Autoscaling Application Block (WASABi) to autoscale the worker roles in your application. When using a large number of worker roles, the utilization must be carefully monitored, if the scaling algorithms are not optimal it could get very expensive!

Read the article
Did I lose my RAID again?

- by BarsMonster

Hi! A little history: 2 years ago I was really excited to find out that mdadm is so powerful that it even can reshape arrays, so you can start with a smaller array and then grow it as you need. I've bought 3x1Tb drives and made a RAID-5. It was fine for a year. Then I bought 2x more, and tried to reshape to RAID-6 out of 5 drives, and due to some mess with superblock versions, lost all content. Had to rebuild it from scratch, but 2Tb of data were gone. Yesterday I bought 2 more drives, and this time I had everything: properly built array, UPS. I've disabled write intent map, added 2 new drives as spares and run a command to grow array to 7-disks. It started working, but speed was ridiculously slow, ~100kb/sec. After processing first 37Mb at such an amazing speed, one of old HDDs fails. I properly shutdown the PC and disconnected the failed drive. After bootup it appeared that it recreated the intent map as it was still in mdadm config, so I removed it from config and rebooted again. Now all I see is that all mdadm processes deadlock, and don't do anything. PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 1937 root 20 0 12992 608 444 D 0 0.1 0:00.00 mdadm 2283 root 20 0 12992 852 704 D 0 0.1 0:00.01 mdadm 2287 root 20 0 0 0 0 D 0 0.0 0:00.01 md0_reshape 2288 root 18 -2 12992 820 676 D 0 0.1 0:00.01 mdadm And all I see in mdstat is: $ cat /proc/mdstat Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] md0 : active raid6 sdb1[1] sdg1[4] sdf1[7] sde1[6] sdd1[0] sdc1[5] 2929683456 blocks super 1.2 level 6, 1024k chunk, algorithm 2 [7/6] [UU_UUUU] [>....................] reshape = 0.0% (37888/976561152) finish=567604147.2min speed=0K/sec I've already tried mdadm 2.6.7, 3.1.4 and 3.2 - nothing helps. Did I lose my data again? Any suggestions on how can I make this work? OS is Ubuntu Server 10.04.2. PS. Needless to say, the data is inaccessible - I cannot mount /dev/md0 to save the most valuable data. You can see my disappointment - the very specific thing I was excited about failed twice taking 5Tb of my data with it. Update: It appears there is some nice info in kern.log: 21:38:48 ...: [ 166.522055] raid5: reshape will continue 21:38:48 ...: [ 166.522085] raid5: device sdb1 operational as raid disk 1 21:38:48 ...: [ 166.522091] raid5: device sdg1 operational as raid disk 4 21:38:48 ...: [ 166.522097] raid5: device sdf1 operational as raid disk 5 21:38:48 ...: [ 166.522102] raid5: device sde1 operational as raid disk 6 21:38:48 ...: [ 166.522107] raid5: device sdd1 operational as raid disk 0 21:38:48 ...: [ 166.522111] raid5: device sdc1 operational as raid disk 3 21:38:48 ...: [ 166.523942] raid5: allocated 7438kB for md0 21:38:48 ...: [ 166.524041] 1: w=1 pa=2 pr=5 m=2 a=2 r=7 op1=0 op2=0 21:38:48 ...: [ 166.524050] 4: w=2 pa=2 pr=5 m=2 a=2 r=7 op1=0 op2=0 21:38:48 ...: [ 166.524056] 5: w=3 pa=2 pr=5 m=2 a=2 r=7 op1=0 op2=0 21:38:48 ...: [ 166.524062] 6: w=4 pa=2 pr=5 m=2 a=2 r=7 op1=0 op2=0 21:38:48 ...: [ 166.524068] 0: w=5 pa=2 pr=5 m=2 a=2 r=7 op1=0 op2=0 21:38:48 ...: [ 166.524073] 3: w=6 pa=2 pr=5 m=2 a=2 r=7 op1=0 op2=0 21:38:48 ...: [ 166.524079] raid5: raid level 6 set md0 active with 6 out of 7 devices, algorithm 2 21:38:48 ...: [ 166.524519] RAID5 conf printout: 21:38:48 ...: [ 166.524523] --- rd:7 wd:6 21:38:48 ...: [ 166.524528] disk 0, o:1, dev:sdd1 21:38:48 ...: [ 166.524532] disk 1, o:1, dev:sdb1 21:38:48 ...: [ 166.524537] disk 3, o:1, dev:sdc1 21:38:48 ...: [ 166.524541] disk 4, o:1, dev:sdg1 21:38:48 ...: [ 166.524545] disk 5, o:1, dev:sdf1 21:38:48 ...: [ 166.524550] disk 6, o:1, dev:sde1 21:38:48 ...: [ 166.524553] ...ok start reshape thread 21:38:48 ...: [ 166.524727] md0: detected capacity change from 0 to 2999995858944 21:38:48 ...: [ 166.524735] md: reshape of RAID array md0 21:38:48 ...: [ 166.524740] md: minimum _guaranteed_ speed: 1000 KB/sec/disk. 21:38:48 ...: [ 166.524745] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reshape. 21:38:48 ...: [ 166.524756] md: using 128k window, over a total of 976561152 blocks. 21:39:05 ...: [ 166.525013] md0: 21:42:04 ...: [ 362.520063] INFO: task mdadm:1937 blocked for more than 120 seconds. 21:42:04 ...: [ 362.520068] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 21:42:04 ...: [ 362.520073] mdadm D 00000000ffffffff 0 1937 1 0x00000000 21:42:04 ...: [ 362.520083] ffff88002ef4f5d8 0000000000000082 0000000000015bc0 0000000000015bc0 21:42:04 ...: [ 362.520092] ffff88002eb5b198 ffff88002ef4ffd8 0000000000015bc0 ffff88002eb5ade0 21:42:04 ...: [ 362.520100] 0000000000015bc0 ffff88002ef4ffd8 0000000000015bc0 ffff88002eb5b198 21:42:04 ...: [ 362.520107] Call Trace: 21:42:04 ...: [ 362.520133] [<ffffffffa0224892>] get_active_stripe+0x312/0x3f0 [raid456] 21:42:04 ...: [ 362.520148] [<ffffffff81059ae0>] ? default_wake_function+0x0/0x20 21:42:04 ...: [ 362.520159] [<ffffffffa0228413>] make_request+0x243/0x4b0 [raid456] 21:42:04 ...: [ 362.520169] [<ffffffffa0221a90>] ? release_stripe+0x50/0x70 [raid456] 21:42:04 ...: [ 362.520179] [<ffffffff81084790>] ? autoremove_wake_function+0x0/0x40 21:42:04 ...: [ 362.520188] [<ffffffff81414df0>] md_make_request+0xc0/0x130 21:42:04 ...: [ 362.520194] [<ffffffff81414df0>] ? md_make_request+0xc0/0x130 21:42:04 ...: [ 362.520205] [<ffffffff8129f8c1>] generic_make_request+0x1b1/0x4f0 21:42:04 ...: [ 362.520214] [<ffffffff810f6515>] ? mempool_alloc_slab+0x15/0x20 21:42:04 ...: [ 362.520222] [<ffffffff8116c2ec>] ? alloc_buffer_head+0x1c/0x60 21:42:04 ...: [ 362.520230] [<ffffffff8129fc80>] submit_bio+0x80/0x110 21:42:04 ...: [ 362.520236] [<ffffffff8116c849>] submit_bh+0xf9/0x140 21:42:04 ...: [ 362.520244] [<ffffffff8116f124>] block_read_full_page+0x274/0x3b0 21:42:04 ...: [ 362.520251] [<ffffffff81172c90>] ? blkdev_get_block+0x0/0x70 21:42:04 ...: [ 362.520258] [<ffffffff8110d875>] ? __inc_zone_page_state+0x35/0x40 21:42:04 ...: [ 362.520265] [<ffffffff810f46d8>] ? add_to_page_cache_locked+0xe8/0x160 21:42:04 ...: [ 362.520272] [<ffffffff81173d78>] blkdev_readpage+0x18/0x20 21:42:04 ...: [ 362.520279] [<ffffffff810f484b>] __read_cache_page+0x7b/0xe0 21:42:04 ...: [ 362.520285] [<ffffffff81173d60>] ? blkdev_readpage+0x0/0x20 21:42:04 ...: [ 362.520290] [<ffffffff81173d60>] ? blkdev_readpage+0x0/0x20 21:42:04 ...: [ 362.520297] [<ffffffff810f57dc>] do_read_cache_page+0x3c/0x120 21:42:04 ...: [ 362.520304] [<ffffffff810f5909>] read_cache_page_async+0x19/0x20 21:42:04 ...: [ 362.520310] [<ffffffff810f591e>] read_cache_page+0xe/0x20 21:42:04 ...: [ 362.520317] [<ffffffff811a6cb0>] read_dev_sector+0x30/0xa0 21:42:04 ...: [ 362.520324] [<ffffffff811a7fcd>] amiga_partition+0x6d/0x460 21:42:04 ...: [ 362.520331] [<ffffffff811a7938>] check_partition+0x138/0x190 21:42:04 ...: [ 362.520338] [<ffffffff811a7a7a>] rescan_partitions+0xea/0x2f0 21:42:04 ...: [ 362.520344] [<ffffffff811744c7>] __blkdev_get+0x267/0x3d0 21:42:04 ...: [ 362.520350] [<ffffffff81174650>] ? blkdev_open+0x0/0xc0 21:42:04 ...: [ 362.520356] [<ffffffff81174640>] blkdev_get+0x10/0x20 21:42:04 ...: [ 362.520362] [<ffffffff811746c1>] blkdev_open+0x71/0xc0 21:42:04 ...: [ 362.520369] [<ffffffff811419f3>] __dentry_open+0x113/0x370 21:42:04 ...: [ 362.520377] [<ffffffff81253f8f>] ? security_inode_permission+0x1f/0x30 21:42:04 ...: [ 362.520385] [<ffffffff8114de3f>] ? inode_permission+0xaf/0xd0 21:42:04 ...: [ 362.520391] [<ffffffff81141d67>] nameidata_to_filp+0x57/0x70 21:42:04 ...: [ 362.520398] [<ffffffff8115207a>] do_filp_open+0x2da/0xba0 21:42:04 ...: [ 362.520406] [<ffffffff811134a8>] ? unmap_vmas+0x178/0x310 21:42:04 ...: [ 362.520414] [<ffffffff8115dbfa>] ? alloc_fd+0x10a/0x150 21:42:04 ...: [ 362.520421] [<ffffffff81141769>] do_sys_open+0x69/0x170 21:42:04 ...: [ 362.520428] [<ffffffff811418b0>] sys_open+0x20/0x30 21:42:04 ...: [ 362.520437] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b 21:42:04 ...: [ 362.520446] INFO: task mdadm:2283 blocked for more than 120 seconds. 21:42:04 ...: [ 362.520450] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 21:42:04 ...: [ 362.520454] mdadm D 0000000000000000 0 2283 2212 0x00000000 21:42:04 ...: [ 362.520462] ffff88002cca7d98 0000000000000086 0000000000015bc0 0000000000015bc0 21:42:04 ...: [ 362.520470] ffff88002ededf78 ffff88002cca7fd8 0000000000015bc0 ffff88002ededbc0 21:42:04 ...: [ 362.520478] 0000000000015bc0 ffff88002cca7fd8 0000000000015bc0 ffff88002ededf78 21:42:04 ...: [ 362.520485] Call Trace: 21:42:04 ...: [ 362.520495] [<ffffffff81543a97>] __mutex_lock_slowpath+0xf7/0x180 21:42:04 ...: [ 362.520502] [<ffffffff8154397b>] mutex_lock+0x2b/0x50 21:42:04 ...: [ 362.520508] [<ffffffff8117404d>] __blkdev_put+0x3d/0x190 21:42:04 ...: [ 362.520514] [<ffffffff811741b0>] blkdev_put+0x10/0x20 21:42:04 ...: [ 362.520520] [<ffffffff811741f3>] blkdev_close+0x33/0x60 21:42:04 ...: [ 362.520527] [<ffffffff81145375>] __fput+0xf5/0x210 21:42:04 ...: [ 362.520534] [<ffffffff811454b5>] fput+0x25/0x30 21:42:04 ...: [ 362.520540] [<ffffffff811415ad>] filp_close+0x5d/0x90 21:42:04 ...: [ 362.520546] [<ffffffff81141697>] sys_close+0xb7/0x120 21:42:04 ...: [ 362.520553] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b 21:42:04 ...: [ 362.520559] INFO: task md0_reshape:2287 blocked for more than 120 seconds. 21:42:04 ...: [ 362.520563] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 21:42:04 ...: [ 362.520567] md0_reshape D ffff88003aee96f0 0 2287 2 0x00000000 21:42:04 ...: [ 362.520575] ffff88003cf05a70 0000000000000046 0000000000015bc0 0000000000015bc0 21:42:04 ...: [ 362.520582] ffff88003aee9aa8 ffff88003cf05fd8 0000000000015bc0 ffff88003aee96f0 21:42:04 ...: [ 362.520590] 0000000000015bc0 ffff88003cf05fd8 0000000000015bc0 ffff88003aee9aa8 21:42:04 ...: [ 362.520597] Call Trace: 21:42:04 ...: [ 362.520608] [<ffffffffa0224892>] get_active_stripe+0x312/0x3f0 [raid456] 21:42:04 ...: [ 362.520616] [<ffffffff81059ae0>] ? default_wake_function+0x0/0x20 21:42:04 ...: [ 362.520626] [<ffffffffa0226f80>] reshape_request+0x4c0/0x9a0 [raid456] 21:42:04 ...: [ 362.520634] [<ffffffff81084790>] ? autoremove_wake_function+0x0/0x40 21:42:04 ...: [ 362.520644] [<ffffffffa022777a>] sync_request+0x31a/0x3a0 [raid456] 21:42:04 ...: [ 362.520651] [<ffffffff81052713>] ? __wake_up+0x53/0x70 21:42:04 ...: [ 362.520658] [<ffffffff814156b1>] md_do_sync+0x621/0xbb0 21:42:04 ...: [ 362.520668] [<ffffffff810387b9>] ? default_spin_lock_flags+0x9/0x10 21:42:04 ...: [ 362.520675] [<ffffffff8141640c>] md_thread+0x5c/0x130 21:42:04 ...: [ 362.520681] [<ffffffff81084790>] ? autoremove_wake_function+0x0/0x40 21:42:04 ...: [ 362.520688] [<ffffffff814163b0>] ? md_thread+0x0/0x130 21:42:04 ...: [ 362.520694] [<ffffffff81084416>] kthread+0x96/0xa0 21:42:04 ...: [ 362.520701] [<ffffffff810131ea>] child_rip+0xa/0x20 21:42:04 ...: [ 362.520707] [<ffffffff81084380>] ? kthread+0x0/0xa0 21:42:04 ...: [ 362.520713] [<ffffffff810131e0>] ? child_rip+0x0/0x20 21:42:04 ...: [ 362.520718] INFO: task mdadm:2288 blocked for more than 120 seconds. 21:42:04 ...: [ 362.520721] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 21:42:04 ...: [ 362.520725] mdadm D 0000000000000000 0 2288 1 0x00000000 21:42:04 ...: [ 362.520733] ffff88002cca9c18 0000000000000086 0000000000015bc0 0000000000015bc0 21:42:04 ...: [ 362.520741] ffff88003aee83b8 ffff88002cca9fd8 0000000000015bc0 ffff88003aee8000 21:42:04 ...: [ 362.520748] 0000000000015bc0 ffff88002cca9fd8 0000000000015bc0 ffff88003aee83b8 21:42:04 ...: [ 362.520755] Call Trace: 21:42:04 ...: [ 362.520763] [<ffffffff81543a97>] __mutex_lock_slowpath+0xf7/0x180 21:42:04 ...: [ 362.520771] [<ffffffff812a6d50>] ? exact_match+0x0/0x10 21:42:04 ...: [ 362.520777] [<ffffffff8154397b>] mutex_lock+0x2b/0x50 21:42:04 ...: [ 362.520783] [<ffffffff811742c8>] __blkdev_get+0x68/0x3d0 21:42:04 ...: [ 362.520790] [<ffffffff81174650>] ? blkdev_open+0x0/0xc0 21:42:04 ...: [ 362.520795] [<ffffffff81174640>] blkdev_get+0x10/0x20 21:42:04 ...: [ 362.520801] [<ffffffff811746c1>] blkdev_open+0x71/0xc0 21:42:04 ...: [ 362.520808] [<ffffffff811419f3>] __dentry_open+0x113/0x370 21:42:04 ...: [ 362.520815] [<ffffffff81253f8f>] ? security_inode_permission+0x1f/0x30 21:42:04 ...: [ 362.520821] [<ffffffff8114de3f>] ? inode_permission+0xaf/0xd0 21:42:04 ...: [ 362.520828] [<ffffffff81141d67>] nameidata_to_filp+0x57/0x70 21:42:04 ...: [ 362.520834] [<ffffffff8115207a>] do_filp_open+0x2da/0xba0 21:42:04 ...: [ 362.520841] [<ffffffff810ff0e1>] ? lru_cache_add_lru+0x21/0x40 21:42:04 ...: [ 362.520848] [<ffffffff8111109c>] ? do_anonymous_page+0x11c/0x330 21:42:04 ...: [ 362.520855] [<ffffffff81115d5f>] ? handle_mm_fault+0x31f/0x3c0 21:42:04 ...: [ 362.520862] [<ffffffff8115dbfa>] ? alloc_fd+0x10a/0x150 21:42:04 ...: [ 362.520868] [<ffffffff81141769>] do_sys_open+0x69/0x170 21:42:04 ...: [ 362.520874] [<ffffffff811418b0>] sys_open+0x20/0x30 21:42:04 ...: [ 362.520882] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b 21:44:04 ...: [ 482.520065] INFO: task mdadm:1937 blocked for more than 120 seconds. 21:44:04 ...: [ 482.520071] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 21:44:04 ...: [ 482.520077] mdadm D 00000000ffffffff 0 1937 1 0x00000000 21:44:04 ...: [ 482.520087] ffff88002ef4f5d8 0000000000000082 0000000000015bc0 0000000000015bc0 21:44:04 ...: [ 482.520096] ffff88002eb5b198 ffff88002ef4ffd8 0000000000015bc0 ffff88002eb5ade0 21:44:04 ...: [ 482.520104] 0000000000015bc0 ffff88002ef4ffd8 0000000000015bc0 ffff88002eb5b198 21:44:04 ...: [ 482.520112] Call Trace: 21:44:04 ...: [ 482.520139] [<ffffffffa0224892>] get_active_stripe+0x312/0x3f0 [raid456] 21:44:04 ...: [ 482.520154] [<ffffffff81059ae0>] ? default_wake_function+0x0/0x20 21:44:04 ...: [ 482.520165] [<ffffffffa0228413>] make_request+0x243/0x4b0 [raid456] 21:44:04 ...: [ 482.520175] [<ffffffffa0221a90>] ? release_stripe+0x50/0x70 [raid456] 21:44:04 ...: [ 482.520185] [<ffffffff81084790>] ? autoremove_wake_function+0x0/0x40 21:44:04 ...: [ 482.520194] [<ffffffff81414df0>] md_make_request+0xc0/0x130 21:44:04 ...: [ 482.520201] [<ffffffff81414df0>] ? md_make_request+0xc0/0x130 21:44:04 ...: [ 482.520212] [<ffffffff8129f8c1>] generic_make_request+0x1b1/0x4f0 21:44:04 ...: [ 482.520221] [<ffffffff810f6515>] ? mempool_alloc_slab+0x15/0x20 21:44:04 ...: [ 482.520229] [<ffffffff8116c2ec>] ? alloc_buffer_head+0x1c/0x60 21:44:04 ...: [ 482.520237] [<ffffffff8129fc80>] submit_bio+0x80/0x110 21:44:04 ...: [ 482.520244] [<ffffffff8116c849>] submit_bh+0xf9/0x140 21:44:04 ...: [ 482.520252] [<ffffffff8116f124>] block_read_full_page+0x274/0x3b0 21:44:04 ...: [ 482.520258] [<ffffffff81172c90>] ? blkdev_get_block+0x0/0x70 21:44:04 ...: [ 482.520266] [<ffffffff8110d875>] ? __inc_zone_page_state+0x35/0x40 21:44:04 ...: [ 482.520273] [<ffffffff810f46d8>] ? add_to_page_cache_locked+0xe8/0x160 21:44:04 ...: [ 482.520280] [<ffffffff81173d78>] blkdev_readpage+0x18/0x20 21:44:04 ...: [ 482.520286] [<ffffffff810f484b>] __read_cache_page+0x7b/0xe0 21:44:04 ...: [ 482.520293] [<ffffffff81173d60>] ? blkdev_readpage+0x0/0x20 21:44:04 ...: [ 482.520299] [<ffffffff81173d60>] ? blkdev_readpage+0x0/0x20 21:44:04 ...: [ 482.520306] [<ffffffff810f57dc>] do_read_cache_page+0x3c/0x120 21:44:04 ...: [ 482.520313] [<ffffffff810f5909>] read_cache_page_async+0x19/0x20 21:44:04 ...: [ 482.520319] [<ffffffff810f591e>] read_cache_page+0xe/0x20 21:44:04 ...: [ 482.520327] [<ffffffff811a6cb0>] read_dev_sector+0x30/0xa0 21:44:04 ...: [ 482.520334] [<ffffffff811a7fcd>] amiga_partition+0x6d/0x460 21:44:04 ...: [ 482.520341] [<ffffffff811a7938>] check_partition+0x138/0x190 21:44:04 ...: [ 482.520348] [<ffffffff811a7a7a>] rescan_partitions+0xea/0x2f0 21:44:04 ...: [ 482.520355] [<ffffffff811744c7>] __blkdev_get+0x267/0x3d0 21:44:04 ...: [ 482.520361] [<ffffffff81174650>] ? blkdev_open+0x0/0xc0 21:44:04 ...: [ 482.520367] [<ffffffff81174640>] blkdev_get+0x10/0x20 21:44:04 ...: [ 482.520373] [<ffffffff811746c1>] blkdev_open+0x71/0xc0 21:44:04 ...: [ 482.520380] [<ffffffff811419f3>] __dentry_open+0x113/0x370 21:44:04 ...: [ 482.520388] [<ffffffff81253f8f>] ? security_inode_permission+0x1f/0x30 21:44:04 ...: [ 482.520396] [<ffffffff8114de3f>] ? inode_permission+0xaf/0xd0 21:44:04 ...: [ 482.520403] [<ffffffff81141d67>] nameidata_to_filp+0x57/0x70 21:44:04 ...: [ 482.520410] [<ffffffff8115207a>] do_filp_open+0x2da/0xba0 21:44:04 ...: [ 482.520417] [<ffffffff811134a8>] ? unmap_vmas+0x178/0x310 21:44:04 ...: [ 482.520426] [<ffffffff8115dbfa>] ? alloc_fd+0x10a/0x150 21:44:04 ...: [ 482.520432] [<ffffffff81141769>] do_sys_open+0x69/0x170 21:44:04 ...: [ 482.520438] [<ffffffff811418b0>] sys_open+0x20/0x30 21:44:04 ...: [ 482.520447] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b 21:44:04 ...: [ 482.520458] INFO: task mdadm:2283 blocked for more than 120 seconds. 21:44:04 ...: [ 482.520462] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 21:44:04 ...: [ 482.520467] mdadm D 0000000000000000 0 2283 2212 0x00000000 21:44:04 ...: [ 482.520475] ffff88002cca7d98 0000000000000086 0000000000015bc0 0000000000015bc0 21:44:04 ...: [ 482.520483] ffff88002ededf78 ffff88002cca7fd8 0000000000015bc0 ffff88002ededbc0 21:44:04 ...: [ 482.520490] 0000000000015bc0 ffff88002cca7fd8 0000000000015bc0 ffff88002ededf78 21:44:04 ...: [ 482.520498] Call Trace: 21:44:04 ...: [ 482.520508] [<ffffffff81543a97>] __mutex_lock_slowpath+0xf7/0x180 21:44:04 ...: [ 482.520515] [<ffffffff8154397b>] mutex_lock+0x2b/0x50 21:44:04 ...: [ 482.520521] [<ffffffff8117404d>] __blkdev_put+0x3d/0x190 21:44:04 ...: [ 482.520527] [<ffffffff811741b0>] blkdev_put+0x10/0x20 21:44:04 ...: [ 482.520533] [<ffffffff811741f3>] blkdev_close+0x33/0x60 21:44:04 ...: [ 482.520541] [<ffffffff81145375>] __fput+0xf5/0x210 21:44:04 ...: [ 482.520547] [<ffffffff811454b5>] fput+0x25/0x30 21:44:04 ...: [ 482.520554] [<ffffffff811415ad>] filp_close+0x5d/0x90 21:44:04 ...: [ 482.520560] [<ffffffff81141697>] sys_close+0xb7/0x120 21:44:04 ...: [ 482.520568] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b 21:44:04 ...: [ 482.520574] INFO: task md0_reshape:2287 blocked for more than 120 seconds. 21:44:04 ...: [ 482.520578] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 21:44:04 ...: [ 482.520582] md0_reshape D ffff88003aee96f0 0 2287 2 0x00000000 21:44:04 ...: [ 482.520590] ffff88003cf05a70 0000000000000046 0000000000015bc0 0000000000015bc0 21:44:04 ...: [ 482.520597] ffff88003aee9aa8 ffff88003cf05fd8 0000000000015bc0 ffff88003aee96f0 21:44:04 ...: [ 482.520605] 0000000000015bc0 ffff88003cf05fd8 0000000000015bc0 ffff88003aee9aa8 21:44:04 ...: [ 482.520612] Call Trace: 21:44:04 ...: [ 482.520623] [<ffffffffa0224892>] get_active_stripe+0x312/0x3f0 [raid456] 21:44:04 ...: [ 482.520633] [<ffffffff81059ae0>] ? default_wake_function+0x0/0x20 21:44:04 ...: [ 482.520643] [<ffffffffa0226f80>] reshape_request+0x4c0/0x9a0 [raid456] 21:44:04 ...: [ 482.520651] [<ffffffff81084790>] ? autoremove_wake_function+0x0/0x40 21:44:04 ...: [ 482.520661] [<ffffffffa022777a>] sync_request+0x31a/0x3a0 [raid456] 21:44:04 ...: [ 482.520668] [<ffffffff81052713>] ? __wake_up+0x53/0x70 21:44:04 ...: [ 482.520675] [<ffffffff814156b1>] md_do_sync+0x621/0xbb0 21:44:04 ...: [ 482.520685] [<ffffffff810387b9>] ? default_spin_lock_flags+0x9/0x10 21:44:04 ...: [ 482.520692] [<ffffffff8141640c>] md_thread+0x5c/0x130 21:44:04 ...: [ 482.520699] [<ffffffff81084790>] ? autoremove_wake_function+0x0/0x40 21:44:04 ...: [ 482.520705] [<ffffffff814163b0>] ? md_thread+0x0/0x130 21:44:04 ...: [ 482.520711] [<ffffffff81084416>] kthread+0x96/0xa0 21:44:04 ...: [ 482.520718] [<ffffffff810131ea>] child_rip+0xa/0x20 21:44:04 ...: [ 482.520725] [<ffffffff81084380>] ? kthread+0x0/0xa0 21:44:04 ...: [ 482.520730] [<ffffffff810131e0>] ? child_rip+0x0/0x20 21:44:04 ...: [ 482.520735] INFO: task mdadm:2288 blocked for more than 120 seconds. 21:44:04 ...: [ 482.520739] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 21:44:04 ...: [ 482.520743] mdadm D 0000000000000000 0 2288 1 0x00000000 21:44:04 ...: [ 482.520751] ffff88002cca9c18 0000000000000086 0000000000015bc0 0000000000015bc0 21:44:04 ...: [ 482.520759] ffff88003aee83b8 ffff88002cca9fd8 0000000000015bc0 ffff88003aee8000 21:44:04 ...: [ 482.520767] 0000000000015bc0 ffff88002cca9fd8 0000000000015bc0 ffff88003aee83b8 21:44:04 ...: [ 482.520774] Call Trace: 21:44:04 ...: [ 482.520782] [<ffffffff81543a97>] __mutex_lock_slowpath+0xf7/0x180 21:44:04 ...: [ 482.520790] [<ffffffff812a6d50>] ? exact_match+0x0/0x10 21:44:04 ...: [ 482.520797] [<ffffffff8154397b>] mutex_lock+0x2b/0x50 21:44:04 ...: [ 482.520804] [<ffffffff811742c8>] __blkdev_get+0x68/0x3d0 21:44:04 ...: [ 482.520810] [<ffffffff81174650>] ? blkdev_open+0x0/0xc0 21:44:04 ...: [ 482.520816] [<ffffffff81174640>] blkdev_get+0x10/0x20 21:44:04 ...: [ 482.520822] [<ffffffff811746c1>] blkdev_open+0x71/0xc0 21:44:04 ...: [ 482.520829] [<ffffffff811419f3>] __dentry_open+0x113/0x370 21:44:04 ...: [ 482.520837] [<ffffffff81253f8f>] ? security_inode_permission+0x1f/0x30 21:44:04 ...: [ 482.520843] [<ffffffff8114de3f>] ? inode_permission+0xaf/0xd0 21:44:04 ...: [ 482.520850] [<ffffffff81141d67>] nameidata_to_filp+0x57/0x70 21:44:04 ...: [ 482.520857] [<ffffffff8115207a>] do_filp_open+0x2da/0xba0 21:44:04 ...: [ 482.520864] [<ffffffff810ff0e1>] ? lru_cache_add_lru+0x21/0x40 21:44:04 ...: [ 482.520871] [<ffffffff8111109c>] ? do_anonymous_page+0x11c/0x330 21:44:04 ...: [ 482.520878] [<ffffffff81115d5f>] ? handle_mm_fault+0x31f/0x3c0 21:44:04 ...: [ 482.520885] [<ffffffff8115dbfa>] ? alloc_fd+0x10a/0x150 21:44:04 ...: [ 482.520891] [<ffffffff81141769>] do_sys_open+0x69/0x170 21:44:04 ...: [ 482.520897] [<ffffffff811418b0>] sys_open+0x20/0x30 21:44:04 ...: [ 482.520905] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b 21:46:04 ...: [ 602.520053] INFO: task mdadm:1937 blocked for more than 120 seconds. 21:46:04 ...: [ 602.520059] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 21:46:04 ...: [ 602.520065] mdadm D 00000000ffffffff 0 1937 1 0x00000000 21:46:04 ...: [ 602.520075] ffff88002ef4f5d8 0000000000000082 0000000000015bc0 0000000000015bc0 21:46:04 ...: [ 602.520084] ffff88002eb5b198 ffff88002ef4ffd8 0000000000015bc0 ffff88002eb5ade0 21:46:04 ...: [ 602.520091] 0000000000015bc0 ffff88002ef4ffd8 0000000000015bc0 ffff88002eb5b198 21:46:04 ...: [ 602.520099] Call Trace: 21:46:04 ...: [ 602.520127] [<ffffffffa0224892>] get_active_stripe+0x312/0x3f0 [raid456] 21:46:04 ...: [ 602.520142] [<ffffffff81059ae0>] ? default_wake_function+0x0/0x20 21:46:04 ...: [ 602.520153] [<ffffffffa0228413>] make_request+0x243/0x4b0 [raid456] 21:46:04 ...: [ 602.520162] [<ffffffffa0221a90>] ? release_stripe+0x50/0x70 [raid456] 21:46:04 ...: [ 602.520171] [<ffffffff81084790>] ? autoremove_wake_function+0x0/0x40 21:46:04 ...: [ 602.520180] [<ffffffff81414df0>] md_make_request+0xc0/0x130 21:46:04 ...: [ 602.520187] [<ffffffff81414df0>] ? md_make_request+0xc0/0x130 21:46:04 ...: [ 602.520197] [<ffffffff8129f8c1>] generic_make_request+0x1b1/0x4f0 21:46:04 ...: [ 602.520206] [<ffffffff810f6515>] ? mempool_alloc_slab+0x15/0x20 21:46:04 ...: [ 602.520215] [<ffffffff8116c2ec>] ? alloc_buffer_head+0x1c/0x60 21:46:04 ...: [ 602.520222] [<ffffffff8129fc80>] submit_bio+0x80/0x110 21:46:04 ...: [ 602.520229] [<ffffffff8116c849>] submit_bh+0xf9/0x140 21:46:04 ...: [ 602.520237] [<ffffffff8116f124>] block_read_full_page+0x274/0x3b0 21:46:04 ...: [ 602.520244] [<ffffffff81172c90>] ? blkdev_get_block+0x0/0x70 21:46:04 ...: [ 602.520252] [<ffffffff8110d875>] ? __inc_zone_page_state+0x35/0x40 21:46:04 ...: [ 602.520259] [<ffffffff810f46d8>] ? add_to_page_cache_locked+0xe8/0x160 21:46:04 ...: [ 602.520266] [<ffffffff81173d78>] blkdev_readpage+0x18/0x20 21:46:04 ...: [ 602.520273] [<ffffffff810f484b>] __read_cache_page+0x7b/0xe0 21:46:04 ...: [ 602.520279] [<ffffffff81173d60>] ? blkdev_readpage+0x0/0x20 21:46:04 ...: [ 602.520285] [<ffffffff81173d60>] ? blkdev_readpage+0x0/0x20 21:46:04 ...: [ 602.520292] [<ffffffff810f57dc>] do_read_cache_page+0x3c/0x120 21:46:04 ...: [ 602.520300] [<ffffffff810f5909>] read_cache_page_async+0x19/0x20 21:46:04 ...: [ 602.520306] [<ffffffff810f591e>] read_cache_page+0xe/0x20 21:46:04 ...: [ 602.520314] [<ffffffff811a6cb0>] read_dev_sector+0x30/0xa0 21:46:04 ...: [ 602.520321] [<ffffffff811a7fcd>] amiga_partition+0x6d/0x460 21:46:04 ...: [ 602.520328] [<ffffffff811a7938>] check_partition+0x138/0x190 21:46:04 ...: [ 602.520335] [<ffffffff811a7a7a>] rescan_partitions+0xea/0x2f0 21:46:04 ...: [ 602.520342] [<ffffffff811744c7>] __blkdev_get+0x267/0x3d0 21:46:04 ...: [ 602.520348] [<ffffffff81174650>] ? blkdev_open+0x0/0xc0 21:46:04 ...: [ 602.520354] [<ffffffff81174640>] blkdev_get+0x10/0x20 21:46:04 ...: [ 602.520359] [<ffffffff811746c1>] blkdev_open+0x71/0xc0 21:46:04 ...: [ 602.520367] [<ffffffff811419f3>] __dentry_open+0x113/0x370 21:46:04 ...: [ 602.520375] [<ffffffff81253f8f>] ? security_inode_permission+0x1f/0x30 21:46:04 ...: [ 602.520383] [<ffffffff8114de3f>] ? inode_permission+0xaf/0xd0 21:46:04 ...: [ 602.520390] [<ffffffff81141d67>] nameidata_to_filp+0x57/0x70 21:46:04 ...: [ 602.520397] [<ffffffff8115207a>] do_filp_open+0x2da/0xba0 21:46:04 ...: [ 602.520404] [<ffffffff811134a8>] ? unmap_vmas+0x178/0x310 21:46:04 ...: [ 602.520413] [<ffffffff8115dbfa>] ? alloc_fd+0x10a/0x150 21:46:04 ...: [ 602.520419] [<ffffffff81141769>] do_sys_open+0x69/0x170 21:46:04 ...: [ 602.520425] [<ffffffff811418b0>] sys_open+0x20/0x30 21:46:04 ...: [ 602.520434] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b 21:46:04 ...: [ 602.520443] INFO: task mdadm:2283 blocked for more than 120 seconds. 21:46:04 ...: [ 602.520447] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 21:46:04 ...: [ 602.520451] mdadm D 0000000000000000 0 2283 2212 0x00000000 21:46:04 ...: [ 602.520460] ffff88002cca7d98 0000000000000086 0000000000015bc0 0000000000015bc0 21:46:04 ...: [ 602.520468] ffff88002ededf78 ffff88002cca7fd8 0000000000015bc0 ffff88002ededbc0 21:46:04 ...: [ 602.520475] 0000000000015bc0 ffff88002cca7fd8 0000000000015bc0 ffff88002ededf78 21:46:04 ...: [ 602.520483] Call Trace: 21:46:04 ...: [ 602.520492] [<ffffffff81543a97>] __mutex_lock_slowpath+0xf7/0x180 21:46:04 ...: [ 602.520500] [<ffffffff8154397b>] mutex_lock+0x2b/0x50 21:46:04 ...: [ 602.520506] [<ffffffff8117404d>] __blkdev_put+0x3d/0x190 21:46:04 ...: [ 602.520512] [<ffffffff811741b0>] blkdev_put+0x10/0x20 21:46:04 ...: [ 602.520518] [<ffffffff811741f3>] blkdev_close+0x33/0x60 21:46:04 ...: [ 602.520526] [<ffffffff81145375>] __fput+0xf5/0x210 21:46:04 ...: [ 602.520533] [<ffffffff811454b5>] fput+0x25/0x30 21:46:04 ...: [ 602.520539] [<ffffffff811415ad>] filp_close+0x5d/0x90 21:46:04 ...: [ 602.520545] [<ffffffff81141697>] sys_close+0xb7/0x120 21:46:04 ...: [ 602.520552] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b

Read the article
The Incremental Architect’s Napkin - #5 - Design functions for extensibility and readability

- by Ralf Westphal

Originally posted on: http://geekswithblogs.net/theArchitectsNapkin/archive/2014/08/24/the-incremental-architectrsquos-napkin---5---design-functions-for.aspx The functionality of programs is entered via Entry Points. So what we´re talking about when designing software is a bunch of functions handling the requests represented by and flowing in through those Entry Points. Designing software thus consists of at least three phases: Analyzing the requirements to find the Entry Points and their signatures Designing the functionality to be executed when those Entry Points get triggered Implementing the functionality according to the design aka coding I presume, you´re familiar with phase 1 in some way. And I guess you´re proficient in implementing functionality in some programming language. But in my experience developers in general are not experienced in going through an explicit phase 2. “Designing functionality? What´s that supposed to mean?” you might already have thought. Here´s my definition: To design functionality (or functional design for short) means thinking about… well, functions. You find a solution for what´s supposed to happen when an Entry Point gets triggered in terms of functions. A conceptual solution that is, because those functions only exist in your head (or on paper) during this phase. But you may have guess that, because it´s “design” not “coding”. And here is, what functional design is not: It´s not about logic. Logic is expressions (e.g. +, -, && etc.) and control statements (e.g. if, switch, for, while etc.). Also I consider calling external APIs as logic. It´s equally basic. It´s what code needs to do in order to deliver some functionality or quality. Logic is what´s doing that needs to be done by software. Transformations are either done through expressions or API-calls. And then there is alternative control flow depending on the result of some expression. Basically it´s just jumps in Assembler, sometimes to go forward (if, switch), sometimes to go backward (for, while, do). But calling your own function is not logic. It´s not necessary to produce any outcome. Functionality is not enhanced by adding functions (subroutine calls) to your code. Nor is quality increased by adding functions. No performance gain, no higher scalability etc. through functions. Functions are not relevant to functionality. Strange, isn´t it. What they are important for is security of investment. By introducing functions into our code we can become more productive (re-use) and can increase evolvability (higher unterstandability, easier to keep code consistent). That´s no small feat, however. Evolvable code can hardly be overestimated. That´s why to me functional design is so important. It´s at the core of software development. To sum this up: Functional design is on a level of abstraction above (!) logical design or algorithmic design. Functional design is only done until you get to a point where each function is so simple you are very confident you can easily code it. Functional design an logical design (which mostly is coding, but can also be done using pseudo code or flow charts) are complementary. Software needs both. If you start coding right away you end up in a tangled mess very quickly. Then you need back out through refactoring. Functional design on the other hand is bloodless without actual code. It´s just a theory with no experiments to prove it. But how to do functional design? An example of functional design Let´s assume a program to de-duplicate strings. The user enters a number of strings separated by commas, e.g. a, b, a, c, d, b, e, c, a. And the program is supposed to clear this list of all doubles, e.g. a, b, c, d, e. There is only one Entry Point to this program: the user triggers the de-duplication by starting the program with the string list on the command line C:\>deduplicate "a, b, a, c, d, b, e, c, a" a, b, c, d, e …or by clicking on a GUI button. This leads to the Entry Point function to get called. It´s the program´s main function in case of the batch version or a button click event handler in the GUI version. That´s the physical Entry Point so to speak. It´s inevitable. What then happens is a three step process: Transform the input data from the user into a request. Call the request handler. Transform the output of the request handler into a tangible result for the user. Or to phrase it a bit more generally: Accept input. Transform input into output. Present output. This does not mean any of these steps requires a lot of effort. Maybe it´s just one line of code to accomplish it. Nevertheless it´s a distinct step in doing the processing behind an Entry Point. Call it an aspect or a responsibility - and you will realize it most likely deserves a function of its own to satisfy the Single Responsibility Principle (SRP). Interestingly the above list of steps is already functional design. There is no logic, but nevertheless the solution is described - albeit on a higher level of abstraction than you might have done yourself. But it´s still on a meta-level. The application to the domain at hand is easy, though: Accept string list from command line De-duplicate Present de-duplicated strings on standard output And this concrete list of processing steps can easily be transformed into code:static void Main(string[] args) { var input = Accept_string_list(args); var output = Deduplicate(input); Present_deduplicated_string_list(output); } Instead of a big problem there are three much smaller problems now. If you think each of those is trivial to implement, then go for it. You can stop the functional design at this point. But maybe, just maybe, you´re not so sure how to go about with the de-duplication for example. Then just implement what´s easy right now, e.g.private static string Accept_string_list(string[] args) { return args[0]; } private static void Present_deduplicated_string_list( string[] output) { var line = string.Join(", ", output); Console.WriteLine(line); } Accept_string_list() contains logic in the form of an API-call. Present_deduplicated_string_list() contains logic in the form of an expression and an API-call. And then repeat the functional design for the remaining processing step. What´s left is the domain logic: de-duplicating a list of strings. How should that be done? Without any logic at our disposal during functional design you´re left with just functions. So which functions could make up the de-duplication? Here´s a suggestion: De-duplicate Parse the input string into a true list of strings. Register each string in a dictionary/map/set. That way duplicates get cast away. Transform the data structure into a list of unique strings. Processing step 2 obviously was the core of the solution. That´s where real creativity was needed. That´s the core of the domain. But now after this refinement the implementation of each step is easy again:private static string[] Parse_string_list(string input) { return input.Split(',') .Select(s => s.Trim()) .ToArray(); } private static Dictionary<string,object> Compile_unique_strings(string[] strings) { return strings.Aggregate( new Dictionary<string, object>(), (agg, s) => { agg[s] = null; return agg; }); } private static string[] Serialize_unique_strings( Dictionary<string,object> dict) { return dict.Keys.ToArray(); } With these three additional functions Main() now looks like this:static void Main(string[] args) { var input = Accept_string_list(args); var strings = Parse_string_list(input); var dict = Compile_unique_strings(strings); var output = Serialize_unique_strings(dict); Present_deduplicated_string_list(output); } I think that´s very understandable code: just read it from top to bottom and you know how the solution to the problem works. It´s a mirror image of the initial design: Accept string list from command line Parse the input string into a true list of strings. Register each string in a dictionary/map/set. That way duplicates get cast away. Transform the data structure into a list of unique strings. Present de-duplicated strings on standard output You can even re-generate the design by just looking at the code. Code and functional design thus are always in sync - if you follow some simple rules. But about that later. And as a bonus: all the functions making up the process are small - which means easy to understand, too. So much for an initial concrete example. Now it´s time for some theory. Because there is method to this madness ;-) The above has only scratched the surface. Introducing Flow Design Functional design starts with a given function, the Entry Point. Its goal is to describe the behavior of the program when the Entry Point is triggered using a process, not an algorithm. An algorithm consists of logic, a process on the other hand consists just of steps or stages. Each processing step transforms input into output or a side effect. Also it might access resources, e.g. a printer, a database, or just memory. Processing steps thus can rely on state of some sort. This is different from Functional Programming, where functions are supposed to not be stateful and not cause side effects.[1] In its simplest form a process can be written as a bullet point list of steps, e.g. Get data from user Output result to user Transform data Parse data Map result for output Such a compilation of steps - possibly on different levels of abstraction - often is the first artifact of functional design. It can be generated by a team in an initial design brainstorming. Next comes ordering the steps. What should happen first, what next etc.? Get data from user Parse data Transform data Map result for output Output result to user That´s great for a start into functional design. It´s better than starting to code right away on a given function using TDD. Please get me right: TDD is a valuable practice. But it can be unnecessarily hard if the scope of a functionn is too large. But how do you know beforehand without investing some thinking? And how to do this thinking in a systematic fashion? My recommendation: For any given function you´re supposed to implement first do a functional design. Then, once you´re confident you know the processing steps - which are pretty small - refine and code them using TDD. You´ll see that´s much, much easier - and leads to cleaner code right away. For more information on this approach I call “Informed TDD” read my book of the same title. Thinking before coding is smart. And writing down the solution as a bunch of functions possibly is the simplest thing you can do, I´d say. It´s more according to the KISS (Keep It Simple, Stupid) principle than returning constants or other trivial stuff TDD development often is started with. So far so good. A simple ordered list of processing steps will do to start with functional design. As shown in the above example such steps can easily be translated into functions. Moving from design to coding thus is simple. However, such a list does not scale. Processing is not always that simple to be captured in a list. And then the list is just text. Again. Like code. That means the design is lacking visuality. Textual representations need more parsing by your brain than visual representations. Plus they are limited in their “dimensionality”: text just has one dimension, it´s sequential. Alternatives and parallelism are hard to encode in text. In addition the functional design using numbered lists lacks data. It´s not visible what´s the input, output, and state of the processing steps. That´s why functional design should be done using a lightweight visual notation. No tool is necessary to draw such designs. Use pen and paper; a flipchart, a whiteboard, or even a napkin is sufficient. Visualizing processes The building block of the functional design notation is a functional unit. I mostly draw it like this: Something is done, it´s clear what goes in, it´s clear what comes out, and it´s clear what the processing step requires in terms of state or hardware. Whenever input flows into a functional unit it gets processed and output is produced and/or a side effect occurs. Flowing data is the driver of something happening. That´s why I call this approach to functional design Flow Design. It´s about data flow instead of control flow. Control flow like in algorithms is of no concern to functional design. Thinking about control flow simply is too low level. Once you start with control flow you easily get bogged down by tons of details. That´s what you want to avoid during design. Design is supposed to be quick, broad brush, abstract. It should give overview. But what about all the details? As Robert C. Martin rightly said: “Programming is abot detail”. Detail is a matter of code. Once you start coding the processing steps you designed you can worry about all the detail you want. Functional design does not eliminate all the nitty gritty. It just postpones tackling them. To me that´s also an example of the SRP. Function design has the responsibility to come up with a solution to a problem posed by a single function (Entry Point). And later coding has the responsibility to implement the solution down to the last detail (i.e. statement, API-call). TDD unfortunately mixes both responsibilities. It´s just coding - and thereby trying to find detailed implementations (green phase) plus getting the design right (refactoring). To me that´s one reason why TDD has failed to deliver on its promise for many developers. Using functional units as building blocks of functional design processes can be depicted very easily. Here´s the initial process for the example problem: For each processing step draw a functional unit and label it. Choose a verb or an “action phrase” as a label, not a noun. Functional design is about activities, not state or structure. Then make the output of an upstream step the input of a downstream step. Finally think about the data that should flow between the functional units. Write the data above the arrows connecting the functional units in the direction of the data flow. Enclose the data description in brackets. That way you can clearly see if all flows have already been specified. Empty brackets mean “no data is flowing”, but nevertheless a signal is sent. A name like “list” or “strings” in brackets describes the data content. Use lower case labels for that purpose. A name starting with an upper case letter like “String” or “Customer” on the other hand signifies a data type. If you like, you also can combine descriptions with data types by separating them with a colon, e.g. (list:string) or (strings:string[]). But these are just suggestions from my practice with Flow Design. You can do it differently, if you like. Just be sure to be consistent. Flows wired-up in this manner I call one-dimensional (1D). Each functional unit just has one input and/or one output. A functional unit without an output is possible. It´s like a black hole sucking up input without producing any output. Instead it produces side effects. A functional unit without an input, though, does make much sense. When should it start to work? What´s the trigger? That´s why in the above process even the first processing step has an input. If you like, view such 1D-flows as pipelines. Data is flowing through them from left to right. But as you can see, it´s not always the same data. It get´s transformed along its passage: (args) becomes a (list) which is turned into (strings). The Principle of Mutual Oblivion A very characteristic trait of flows put together from function units is: no functional units knows another one. They are all completely independent of each other. Functional units don´t know where their input is coming from (or even when it´s gonna arrive). They just specify a range of values they can process. And they promise a certain behavior upon input arriving. Also they don´t know where their output is going. They just produce it in their own time independent of other functional units. That means at least conceptually all functional units work in parallel. Functional units don´t know their “deployment context”. They now nothing about the overall flow they are place in. They are just consuming input from some upstream, and producing output for some downstream. That makes functional units very easy to test. At least as long as they don´t depend on state or resources. I call this the Principle of Mutual Oblivion (PoMO). Functional units are oblivious of others as well as an overall context/purpose. They are just parts of a whole focused on a single responsibility. How the whole is built, how a larger goal is achieved, is of no concern to the single functional units. By building software in such a manner, functional design interestingly follows nature. Nature´s building blocks for organisms also follow the PoMO. The cells forming your body do not know each other. Take a nerve cell “controlling” a muscle cell for example:[2] The nerve cell does not know anything about muscle cells, let alone the specific muscel cell it is “attached to”. Likewise the muscle cell does not know anything about nerve cells, let a lone a specific nerve cell “attached to” it. Saying “the nerve cell is controlling the muscle cell” thus only makes sense when viewing both from the outside. “Control” is a concept of the whole, not of its parts. Control is created by wiring-up parts in a certain way. Both cells are mutually oblivious. Both just follow a contract. One produces Acetylcholine (ACh) as output, the other consumes ACh as input. Where the ACh is going, where it´s coming from neither cell cares about. Million years of evolution have led to this kind of division of labor. And million years of evolution have produced organism designs (DNA) which lead to the production of these different cell types (and many others) and also to their co-location. The result: the overall behavior of an organism. How and why this happened in nature is a mystery. For our software, though, it´s clear: functional and quality requirements needs to be fulfilled. So we as developers have to become “intelligent designers” of “software cells” which we put together to form a “software organism” which responds in satisfying ways to triggers from it´s environment. My bet is: If nature gets complex organisms working by following the PoMO, who are we to not apply this recipe for success to our much simpler “machines”? So my rule is: Wherever there is functionality to be delivered, because there is a clear Entry Point into software, design the functionality like nature would do it. Build it from mutually oblivious functional units. That´s what Flow Design is about. In that way it´s even universal, I´d say. Its notation can also be applied to biology: Never mind labeling the functional units with nouns. That´s ok in Flow Design. You´ll do that occassionally for functional units on a higher level of abstraction or when their purpose is close to hardware. Getting a cockroach to roam your bedroom takes 1,000,000 nerve cells (neurons). Getting the de-duplication program to do its job just takes 5 “software cells” (functional units). Both, though, follow the same basic principle. Translating functional units into code Moving from functional design to code is no rocket science. In fact it´s straightforward. There are two simple rules: Translate an input port to a function. Translate an output port either to a return statement in that function or to a function pointer visible to that function. The simplest translation of a functional unit is a function. That´s what you saw in the above example. Functions are mutually oblivious. That why Functional Programming likes them so much. It makes them composable. Which is the reason, nature works according to the PoMO. Let´s be clear about one thing: There is no dependency injection in nature. For all of an organism´s complexity no DI container is used. Behavior is the result of smooth cooperation between mutually oblivious building blocks. Functions will often be the adequate translation for the functional units in your designs. But not always. Take for example the case, where a processing step should not always produce an output. Maybe the purpose is to filter input. Here the functional unit consumes words and produces words. But it does not pass along every word flowing in. Some words are swallowed. Think of a spell checker. It probably should not check acronyms for correctness. There are too many of them. Or words with no more than two letters. Such words are called “stop words”. In the above picture the optionality of the output is signified by the astrisk outside the brackets. It means: Any number of (word) data items can flow from the functional unit for each input data item. It might be none or one or even more. This I call a stream of data. Such behavior cannot be translated into a function where output is generated with return. Because a function always needs to return a value. So the output port is translated into a function pointer or continuation which gets passed to the subroutine when called:[3]void filter_stop_words( string word, Action<string> onNoStopWord) { if (...check if not a stop word...) onNoStopWord(word); } If you want to be nitpicky you might call such a function pointer parameter an injection. And technically you´re right. Conceptually, though, it´s not an injection. Because the subroutine is not functionally dependent on the continuation. Firstly continuations are procedures, i.e. subroutines without a return type. Remember: Flow Design is about unidirectional data flow. Secondly the name of the formal parameter is chosen in a way as to not assume anything about downstream processing steps. onNoStopWord describes a situation (or event) within the functional unit only. Translating output ports into function pointers helps keeping functional units mutually oblivious in cases where output is optional or produced asynchronically. Either pass the function pointer to the function upon call. Or make it global by putting it on the encompassing class. Then it´s called an event. In C# that´s even an explicit feature.class Filter { public void filter_stop_words( string word) { if (...check if not a stop word...) onNoStopWord(word); } public event Action<string> onNoStopWord; } When to use a continuation and when to use an event dependens on how a functional unit is used in flows and how it´s packed together with others into classes. You´ll see examples further down the Flow Design road. Another example of 1D functional design Let´s see Flow Design once more in action using the visual notation. How about the famous word wrap kata? Robert C. Martin has posted a much cited solution including an extensive reasoning behind his TDD approach. So maybe you want to compare it to Flow Design. The function signature given is:string WordWrap(string text, int maxLineLength) {...} That´s not an Entry Point since we don´t see an application with an environment and users. Nevertheless it´s a function which is supposed to provide a certain functionality. The text passed in has to be reformatted. The input is a single line of arbitrary length consisting of words separated by spaces. The output should consist of one or more lines of a maximum length specified. If a word is longer than a the maximum line length it can be split in multiple parts each fitting in a line. Flow Design Let´s start by brainstorming the process to accomplish the feat of reformatting the text. What´s needed? Words need to be assembled into lines Words need to be extracted from the input text The resulting lines need to be assembled into the output text Words too long to fit in a line need to be split Does sound about right? I guess so. And it shows a kind of priority. Long words are a special case. So maybe there is a hint for an incremental design here. First let´s tackle “average words” (words not longer than a line). Here´s the Flow Design for this increment: The the first three bullet points turned into functional units with explicit data added. As the signature requires a text is transformed into another text. See the input of the first functional unit and the output of the last functional unit. In between no text flows, but words and lines. That´s good to see because thereby the domain is clearly represented in the design. The requirements are talking about words and lines and here they are. But note the asterisk! It´s not outside the brackets but inside. That means it´s not a stream of words or lines, but lists or sequences. For each text a sequence of words is output. For each sequence of words a sequence of lines is produced. The asterisk is used to abstract from the concrete implementation. Like with streams. Whether the list of words gets implemented as an array or an IEnumerable is not important during design. It´s an implementation detail. Does any processing step require further refinement? I don´t think so. They all look pretty “atomic” to me. And if not… I can always backtrack and refine a process step using functional design later once I´ve gained more insight into a sub-problem. Implementation The implementation is straightforward as you can imagine. The processing steps can all be translated into functions. Each can be tested easily and separately. Each has a focused responsibility. And the process flow becomes just a sequence of function calls: Easy to understand. It clearly states how word wrapping works - on a high level of abstraction. And it´s easy to evolve as you´ll see. Flow Design - Increment 2 So far only texts consisting of “average words” are wrapped correctly. Words not fitting in a line will result in lines too long. Wrapping long words is a feature of the requested functionality. Whether it´s there or not makes a difference to the user. To quickly get feedback I decided to first implement a solution without this feature. But now it´s time to add it to deliver the full scope. Fortunately Flow Design automatically leads to code following the Open Closed Principle (OCP). It´s easy to extend it - instead of changing well tested code. How´s that possible? Flow Design allows for extension of functionality by inserting functional units into the flow. That way existing functional units need not be changed. The data flow arrow between functional units is a natural extension point. No need to resort to the Strategy Pattern. No need to think ahead where extions might need to be made in the future. I just “phase in” the remaining processing step: Since neither Extract words nor Reformat know of their environment neither needs to be touched due to the “detour”. The new processing step accepts the output of the existing upstream step and produces data compatible with the existing downstream step. Implementation - Increment 2 A trivial implementation checking the assumption if this works does not do anything to split long words. The input is just passed on: Note how clean WordWrap() stays. The solution is easy to understand. A developer looking at this code sometime in the future, when a new feature needs to be build in, quickly sees how long words are dealt with. Compare this to Robert C. Martin´s solution:[4] How does this solution handle long words? Long words are not even part of the domain language present in the code. At least I need considerable time to understand the approach. Admittedly the Flow Design solution with the full implementation of long word splitting is longer than Robert C. Martin´s. At least it seems. Because his solution does not cover all the “word wrap situations” the Flow Design solution handles. Some lines would need to be added to be on par, I guess. But even then… Is a difference in LOC that important as long as it´s in the same ball park? I value understandability and openness for extension higher than saving on the last line of code. Simplicity is not just less code, it´s also clarity in design. But don´t take my word for it. Try Flow Design on larger problems and compare for yourself. What´s the easier, more straightforward way to clean code? And keep in mind: You ain´t seen all yet ;-) There´s more to Flow Design than described in this chapter. In closing I hope I was able to give you a impression of functional design that makes you hungry for more. To me it´s an inevitable step in software development. Jumping from requirements to code does not scale. And it leads to dirty code all to quickly. Some thought should be invested first. Where there is a clear Entry Point visible, it´s functionality should be designed using data flows. Because with data flows abstraction is possible. For more background on why that´s necessary read my blog article here. For now let me point out to you - if you haven´t already noticed - that Flow Design is a general purpose declarative language. It´s “programming by intention” (Shalloway et al.). Just write down how you think the solution should work on a high level of abstraction. This breaks down a large problem in smaller problems. And by following the PoMO the solutions to those smaller problems are independent of each other. So they are easy to test. Or you could even think about getting them implemented in parallel by different team members. Flow Design not only increases evolvability, but also helps becoming more productive. All team members can participate in functional design. This goes beyon collective code ownership. We´re talking collective design/architecture ownership. Because with Flow Design there is a common visual language to talk about functional design - which is the foundation for all other design activities. PS: If you like what you read, consider getting my ebook “The Incremental Architekt´s Napkin”. It´s where I compile all the articles in this series for easier reading. I like the strictness of Function Programming - but I also find it quite hard to live by. And it certainly is not what millions of programmers are used to. Also to me it seems, the real world is full of state and side effects. So why give them such a bad image? That´s why functional design takes a more pragmatic approach. State and side effects are ok for processing steps - but be sure to follow the SRP. Don´t put too much of it into a single processing step. ? Image taken from www.physioweb.org ? My code samples are written in C#. C# sports typed function pointers called delegates. Action is such a function pointer type matching functions with signature void someName(T t). Other languages provide similar ways to work with functions as first class citizens - even Java now in version 8. I trust you find a way to map this detail of my translation to your favorite programming language. I know it works for Java, C++, Ruby, JavaScript, Python, Go. And if you´re using a Functional Programming language it´s of course a no brainer. ? Taken from his blog post “The Craftsman 62, The Dark Path”. ?

Read the article
You Might Be a DBA

- by BuckWoody

With all apologies to Jeff Foxworthy, I was up late Friday night on a holiday weekend (which translated into T-SQL becomes “Maintenance Window”) and I got bored in between the two or three minutes I had between clicks. So I started a “Twitter” meme – and it just took off. I haven’t cleaned these up much, but here, in author order as of Saturday the 29th of May is the list “You might be a DBA” from around the Twitterverse: buckwoody Your two main enemies are developers and SAN admins #youmightbeaDBA buckwoody People can use Access as a cross or garlic on you #youmightbeaDBA buckwoody You always plan an exit strategy, even when entering a McDonald's #youmightbeaDBA buckwoody You can't explain to your family what you really do for a living #youmightbeaDBA buckwoody You have at least one set of scripts you won't share #youmightbeaDBA buckwoody You have an opinion on the best code-beautifier #youmightbeaDBA buckwoody You have children older than the rest of your team #youmightbeaDBA buckwoody You and the Oracle DBA would kill each other, but you'll happily fight off a developer together first #youmightbeaDBA buckwoody You've threatened to quit if they give anyone the sa password on production #youmightbeaDBA buckwoody You've sent a vendor suggestions on improving their database design or code (and been ignored) #youmightbeaDBA buckwoody You've sent a vendor suggestions on improving their database design or code (and been ignored) #youmightbeaDBA buckwoody You have an opinion on the best code-beautifier #youmightbeaDBA buckwoody You have at least one set of scripts you won't share #youmightbeaDBA buckwoody You refer to co-workers as "carbon-units" #youmightbeaDBA buckwoody Being paranoid is on your resume at the top #youmightbeaDBA buckwoody Everyone comes to your cube to find the MSDN DVD's #youmightbeaDBA buckwoody You always plan an exit strategy, even when entering a McDonald's #youmightbeaDBA buckwoody You've worn down developers to get your way by explaining normalization levels #youmightbeaDBA buckwoody You refer to clothes as "Data Abstractions" #youmightbeaDBA buckwoody Users pester you to be able to put data in a database, then they pester you to take it out and put it in Excel #youmightbeaDBA buckwoody Others try to de-duplicate data, you try to copy it to more than three locations #youmightbeaDBA buckwoody You have at least one DLT tape in the trunk of your car #youmightbeaDBA buckwoody You use twitter and facebook to talk with colleagues because there's no one else in your company that does what you do #youmightbeaDBA buckwoody Your spouse knows what "ETL" means #youmightbeaDBA buckwoody You've referred to yourself as the "Data Janitor" #youmightbeaDBA buckwoody You don't have positive connotations of the word "upgrade" #youmightbeaDBA buckwoody You get your coffee before you check your servers, because you know you won't get any if you don't #youmightbeaDBA buckwoody You always come to work through the back door so no one hijacks you on the way to your cube #youmightbeaDBA buckwoody You check your server logs before you check your e-mail in the morning so you can reply "Yeah, I already fixed that." #youmightbeaDBA buckwoody You have more conference badges than clean socks #youmightbeaDBA buckwoody Your coffee mug says "It depends" #youmightbeaDBA buckwoody You can convince a boss that you need 16GB of RAM in your laptop #youmightbeaDBA buckwoody You've used ebay to find production equipment #youmightbeaDBA buckwoody You pad all project timelines by 2X, and you still miss them #youmightbeaDBA buckwoody You know when your company is acquiring another even before the CFO #youmightbeaDBA buckwoody You pad all project timelines by 2X, and you still miss them #youmightbeaDBA buckwoody You call aspirin "work vitamins" #youmightbeaDBA buckwoody You get the same amount of sleep even after you have a child #youmightbeaDBA buckwoody You obsess about performance metrics from over one year ago #youmightbeaDBA buckwoody The first thing you buy after the database software is aftermarket tools to manage the database software #youmightbeaDBA buckwoody You've tried to convince someone else to become a DBA #youmightbeaDBA buckwoody You use twitter and facebook to talk with colleagues because there's no one else in your company that does what you do #youmightbeaDBA buckwoody You only know other DBA's by their Tweet Handle #youmightbeaDBA buckwoody You've explained the difference between 32 and 64-bit to more than one manager in terms they can understand, using puppets #youmightbeaDBA buckwoody Your two main enemies are developers and SAN admins #youmightbeaDBA buckwoody You've driven to the Datacenter to install SQL Server because "you don't trust those NOC admins" #youmightbeaDBA buckwoody You pay more for faster Internet connections than cable at home so you don't have to drive in #youmightbeaDBA buckwoody You call texting a "queuing system" #youmightbeaDBA buckwoody You know that if someone can read Perl, they manage an Oracle system #youmightbeaDBA buckwoody You have an e-mail rule for backup notifications #youmightbeaDBA buckwoody Your food pyramid includes coffee, salt and fat #youmightbeaDBA buckwoody You wish everything had a graphical query plan #youmightbeaDBA buckwoody You refactor your e-mails #youmightbeaDBA buckwoody You've gotten more help from twitter and facebook than all your years in college #youmightbeaDBA buckwoody You would pay money for a license plate that has the letters S-Q-L together #youmightbeaDBA buckwoody You have actually considered making a RAID array from thumb drives #youmightbeaDBA buckwoody Everything on your laptop is installed from your MSDN subscription #youmightbeaDBA buckwoody You've written blog posts on technology you've never actually implemented in production #youmightbeaDBA buckwoody Everything on your laptop is installed from your MSDN subscription #youmightbeaDBA buckwoody @MidnightDBA Click the #youmightbeaDBA tag. I've had WAY too much coffee today. buckwoody There is no other position that is 1-deep except you and the CEO #youmightbeaDBA buckwoody When you watch "The Office" you call it "OJT" #youmightbeaDBA buckwoody You would pay money for a license plate that has the letters S-Q-L together #youmightbeaDBA buckwoody Your blog would make a "best practices" or "worst practices" book #youmightbeaDBA buckwoody You have actually considered making a RAID array from thumb drives #youmightbeaDBA buckwoody The first thing you install on your netbook is SSMS #youmightbeaDBA buckwoody Everything on your laptop is installed from your MSDN subscription #youmightbeaDBA buckwoody Your watch is set to UTC because it's just easier #youmightbeaDBA buckwoody You make plenty of money, but you're excited to get a $2.00 squeeze-ball from Quest and Redgate #youmightbeaDBA buckwoody You make plenty of money, but you're excited to get a $2.00 squeeze-ball from Quest and Redgate #youmightbeaDBA buckwoody You think data can be represented as something OTHER than XML #youmightbeaDBA buckwoody You tell people that you made a database query go faster, and expect them to be happy for you #youmightbeaDBA buckwoody You take the word "NoSQL" as a personal attack #youmightbeaDBA buckwoody People can use Access as a cross or garlic on you #youmightbeaDBA buckwoody * == bad #youmightbeaDBA buckwoody * == bad #youmightbeaDBA buckwoody There are just as many females in your technical field as males #youmightbeaDBA buckwoody People can use Access as a cross or garlic on you #youmightbeaDBA buckwoody You've gotten more help from twitter and facebook than all your years in college #youmightbeaDBA buckwoody You think that something OTHER than the database might be the performance bottleneck #youmightbeaDBA buckwoody You refer to time as a "Clustered Index" #youmightbeaDBA buckwoody You know why "user" refers to both business people and crack addicts #youmightbeaDBA buckwoody You make plenty of money, but you're excited to get a $2.00 squeeze-ball from Quest and Redgate #youmightbeaDBA buckwoody You can't explain to your family what you really do for a living #youmightbeaDBA buckwoody You tell people that you made a database query go faster, and expect them to be happy for you #youmightbeaDBA buckwoody You think a millisecond is a really long time #youmightbeaDBA buckwoody You're sitting and typing #youmightbeaDBA when you could be outside #youmightbeaDBA buckwoody You can't wait for a technical conference so you can wear a kilt - and you're not Scottish #youmightbeaDBA buckwoody You know that "DBA" stands for "Default Blame Acceptor" #youmightbeaDBA buckwoody People can use Access as a cross or garlic on you #youmightbeaDBA buckwoody You know what "the truth, thole truth and nothing but the truth, so help me Codd" means #youmightbeaDBA buckwoody You've gotten more help from twitter and facebook than all your years in college #youmightbeaDBA buckwoody You can't talk fast enough to get a concept out of your head so you tweet it instead #youmightbeaDBA buckwoody You cry when someone doesn't use a WHERE clause #youmightbeaDBA buckwoody You think data can be represented as something OTHER than XML #youmightbeaDBA buckwoody You think "Set theory" is not an verb but a noun #youmightbeaDBA buckwoody You try to convince random strangers to vote on your Connect item #youmightbeaDBA buckwoody You think 3 hours of contiguous sleep is a good thing #youmightbeaDBA or #youmightbeamother buckwoody You don't like Oracle, and not just because of what she did to Neo #youmightbeaDBA buckwoody You know when to say "sequel" and "s-q-l" #youmightbeaDBA buckwoody You know where the data is #youmightbeaDBA buckwoody You refer to your children as "Fully Redundant Mirrors" #youmightbeaDBA buckwoody Holiday == "Maintenance Window" #youmightbeaDBA buckwoody Your laptop is more powerful than the servers in most companies - including your own #youmightbeaDBA buckwoody You capitalize SELECTed words #youmightbeaDBA buckwoody You take the word "NoSQL" as a personal attack #youmightbeaDBA buckwoody You know why "user" refers to both business people and crack addicts #youmightbeaDBA buckwoody You cringe in public when the word "upgrade" is used in a sentence #youmightbeaDBA buckwoody Holiday == "Maintenance Window" #youmightbeaDBA buckwoody All Data Is MetaData means something to you #youmightbeaDBA buckwoody You've never seen the driveway to your house in the daylight #youmightbeaDBA buckwoody You think that something OTHER than the database might be the performance bottleneck #youmightbeaDBA buckwoody Most of your bloodstream is composed of caffeine #youmightbeaDBA buckwoody Your task list is labeled "CRUD Matrix" #youmightbeaDBA buckwoody You call your wife/husband a "Linked Server" #youmightbeaDBA anonythemouse When someone tells you they are going to take a dump and you wonder of which database then #youmightbeaDBA anonythemouse When it's 11pm on a holiday weekend and you are working #youmightbeaDBA anonythemouse When you sit down at a table and look for it's primary key #youmightbeaDBA anonythemouse When getting milk from the fridge you check the expiry date is > getdate() #youmightbeaDBA blakmk when you wake up dreaming about sql #youmightbeaDBA CharlesGarver You think a @buckwoody bobblehead would be a cool thing to have on the dashboard of your car #youmightbeaDBA CharlesGarver Your friends don't understand why you think there's a difference between single and double quotes #youmightbeaDBA CharlesGarver Even the newest employees know your name from all the downtime notices you've sent out #youmightbeaDBA CharlesGarver You sometimes feel anxious and think "I should test restoring those backups" and then the feeling passes #youmightbeadba CharlesGarver You know what a co-worker means when they ask "how is your squirrel server?" #youmightbeadba CharlesGarver You can't sleep at night and you ponder the logisitcs of collecting every copy of Access for the world's biggest bonfire #youmightbeaDBA CharlesGarver You can't sleep at night and you ponder the logisitcs of collecting every copy of Access for the world's biggest bonfire #youmightbeaDBA CharlesGarver You're willing to move someone's job up in priority for a box of #voodoodonuts #youmightbeaDBA CharlesGarver Each person in your company seems to think you work for THEM #youmightbeaDBA CharlesGarver You have a Love/Hate relationship going on with #Microsoft #youmightbeaDBA CharlesGarver People ask you to troubleshoot their Access program #youmightbeaDBA CharlesGarver The first words you hear in the morning are 'your voicemail box is full' #youmightbeaDBA CharlesGarver The thought of disrupting 500 people's work so you can do something doesn't phase you #youmightbeaDBA CharlesGarver You can't sleep at night and you ponder the logisitcs of collecting every copy of Access for the world's biggest bonfire #youmightbeaDBA CharlesGarver Your home computer is backed up in 3 different places #youmightbeaDBA CharlesGarver Your wardrobe for work includes pajamas #youmightbeaDBA CharlesGarver Someone tells you to look in the INDEX and you look puzzled before finally going to the back of the book. #youmightbeaDBA chuckboycejr If you have ever set up a SQLAgent job to email your mobile phone to serve as an alarm clock #youmightbeaDBA chuckboycejr If you'd rather meet Itzik than Jay Z #youmightbeaDBA chuckboycejr If you'd rather meet Itzik than Jay Z #youmightbeaDBA chuckboycejr If you'd wrestle a SysAdmin to the ground to implement #DPA best practices as per @aspiringgeek #youmightbeaDBA databaseguy I need to be up in 7 hours, so I'm off to bed! I'll have to read the rest of @buckwoody's #youmightbeaDBA posts in the AM. (g'night Buck!) databaseguy When people ask you about your house, the first thing you describe is the network. #youmightbeaDBA databaseguy The last thing you say at the office each day is, "is anybody else here? I'm shutting off the lights!" #youmightbeaDBA databaseguy Your blood pressure rises when you read application specs drafted by marketing. #youmightbeaDBA databaseguy A good day at work is one when nobody pays you no mind. #youmightbeaDBA databaseguy You care about latches and wait states. #youmightbeaDBA databaseguy You have worked over 200 hours on a performance tuning project that required no application changes at all. #youmightbeaDBA databaseguy The late-night security guard knows the names of your spouse and kids. #youmightbeaDBA databaseguy You have had vigorous debates about whether it should be pronounced "sequel" or "ess-queue-ell". #youmightbeaDBA databaseguy You have VPN and RDP software installed on your phone ... just in case. #youmightbeaDBA databaseguy You have edited a data file by hand, just to see what would happen. #youmightbeaDBA databaseguy You decorate your office walls with database catalog posters. #youmightbeaDBA databaseguy You've built programs that access data just to keep other developers from asking you to run queries all the time. #youmightbeaDBA databaseguy When you watch movies like The Matrix, you find yourself calculating the fasibility of storing all that data. #youmightbeaDBA databaseguy You have tried to convince someone to spend money on an SSD storage array. #youmightbeaDBA databaseguy When CPU is spiked on a server, you want to gather forensic evidence. #youmightbeaDBA databaseguy You have to remind developers not to push code to production without checking if the database is ready. #youmightbeaDBA databaseguy Nobody cares what you wear to work, as long as the thing keeps running. #youmightbeaDBA databaseguy Telepathy is a job requirement when working with app dev teams. #youmightbeaDBA databaseguy You read database statistics for the educational value. #youmightbeaDBA databaseguy And your boss freely admits this to anyone within earshot. #youmightbeaDBA databaseguy Your boss cannot explain or understand what you do. #youmightbeaDBA databaseguy You envision ERDs when you see a GUI. #youmightbeaDBA databaseguy You say things like "applications come and go, but data lasts forever." #youmightbeaDBA databaseguy You have memorized the names of several of the AdventureWorks employees. #youmightbeaDBA databaseguy You know what MAXDOP setting you can get away with for a big query based on current server load. #youmightbeaDBA databaseguy And you immediately recognize the recursion in my last tweet. #youmightbeaDBA databaseguy You find 50 simultaneous tweets from @buckwoody about #youmightbeaDBA :O) DBAishness You have "funny stories" about the times your developers accidentally deleted the T-log in their test environment. #youmightbeaDBA DBAishness Planning to slice and dice your MDW data with PowerPivot makes you giggle like a schoolgirl. #youmightbeaDBA donalddotfarmer You think @buckwoody lives in the "real world." #youmightbeaDBA jamach09 @buckwoody #youmightbeaDBA Why go outside when you can sit in the nice cool server room? jamach09 If you refer to procreation as "Replication", #youmightbeaDBA. jamach09 If you think ORM is a four-letter word, #youmightbeaDBA JamesMarsh If you have ever preached the value of Source Code Control, #YouMightBeADBA jethrocarr @venzann You store your shopping list in a ACID compliant DB #youmightbeaDBA joe_positive @buckwoody thought it stood for "Don't Bother Asking" #youmightbeaDBA joe_positive when you check your IT Events Calendar before making weekend plans #youmightbeaDBA LadyRuna You cringe whenever someone calls Excel a database #youmightbeaDBA LadyRuna When the waiter says he'll be your server today, you ask how many terabytes he is #youmightbeaDBA LadyRuna you always call the asterisk a "Star" #youmightbeaDBA LadyRuna You walk into a server room, say "Nice RACK!" and everyone there knows you're talking about server rack... #youmightbeaDBA LadyRuna You receive more messages from servers than from friends #youmightbeaDBA LadyRuna hmmm... #youmightbeaDBA if your recipe for gumbo is "SELECT * FROM Refrigerator" markjholmes @SQLSoldier Heh. #youmightbeaDBA if you correct other DBAs' spelling of @PaulRandal markjholmes #youmightbeaDBA if you actually test RAID5 vs RAID10 on your SAN because when it comes to configuration, "it depends." markjholmes #youmightbeaDBA if you have at least 3 definitions of the word "cluster" MarlonRibunal 3 Words: @BrentO, snicker, & Access #youmightbeaDBA MarlonRibunal @onpnt @mikeSQL my appeal was a couple of mins late. Enjoying #youmightbeaDBA MarlonRibunal @mikeSQL @onpnt pls, don't mention bacon #youmightbeaDBA merv @buckwoody You HATE 3-way joins #youmightbeaDBA MidnightDBA If you're up at midnight Tweeting about SQL #youmightbeaDBA MidnightDBA @buckwoody I'd noticed that. :) #youmightbeaDBA mikeSQL when people talk about "their type" you're thinking varchar, bigint, binary, etc #youmightbeadba mikeSQL people ask you to go to lunch , but you can't go because you're attending #SQLlunch #youmightbeadba mikeSQL you laugh for hours at all of the #sqlmoviequotes ....things in which a normal individual would scratch their head at. #youmightbeadba mikeSQL you laugh for hours at all of the #sqlmoviequotes ....things in which a normal individual would scratch their head at. #youmightbeadba mrdenny If you think that @buckwoody's demo using PowerPivot to analyze index usage data from DMVs is awesome then #youmightbeaDBA mrdenny You wish @PaulRandal still worked at Microsoft so that they would make a bobble head of him #youmightbeadba mrdenny When it's 11pm on a holiday weekend, and your posting stupid jokes on Twitter then #youmightbeadba mrdenny If you go out with friends and wonder why no one's wearing a kilt then #YouMightBeADBA mrdenny You can't do basic math, but you know off the top of your head how many CALs $14,412 can buy you. #YoumightbeaDBA mrdenny If you've ever setup a SQL Job to email you to get you out of a regularly scheduled meeting #YouMightBeADBA. mrdenny You throw up in your mouth a little when ever you here the word "Access". Even if it doesn't relate to a MS product. #YouMightBeADBA msdtjones You spend more time listening to @buckwoody than your wife #youmightbeaDBA NFDotCom You perform "hail deltas" on a regular basis. #YouMightBeADBA NoelMcKinney If you tell your wife you want to go to Columbus Ohio for your wedding anniversary so you can attend #sqlsat42 then #youmightbeaDBA NoelMcKinney You read a union is on strike and wonder if it's a UNION ALL #youmightbeaDBA NoelMcKinney You read a union is on strike and wonder if it's a UNION ALL #youmightbeaDBA NoelMcKinney Someone asks you to throw another log on the fire and you tell them not to worry about it because Autogrowth is turned on #youmightbeaDBA Nuurdygirl Even if you have a girlfriend...its possible #youmightbeadba. Yeah-i said its possible! Nuurdygirl When your girlfriend has to lean around the laptop to kiss you goodnight #youmightbeadba Old_Man_Fish If you worry about how big your package is and how long it takes to finish #youmightbeaDBA Old_Man_Fish If you no longer wonder if someone is in trouble or died if you are getting calls at 2AM #youmightbeaDBA Old_Man_Fish If, when you hear the word ACCESS with no connotation you blood pressure jumps 50 points, #youmightbeaDBA onpnt When you hear the word inject you immediately get concerned if your databases are OK #youmightbeaDBA onpnt Your servers haven't been rebooted in a year #youmightbeaDBA onpnt You know why it's funny when @PaulRandal has the word, "Sheep" in a tweet #youmightbeaDBA onpnt You have read BOL without actually having a problem to figure out #youmightbeaDBA onpnt You can type "SELECT columns FROM tables" without typos but tipen ni Banglish ares a messis #youmightbeaDBA onpnt DR strategies doesn't include the word, RAID in them #youmightbeaDBA onpnt you can move a SQL Server instance to a new server without the users ever knowing #youmightbeaDBA onpnt You have made an SSIS package that is more than one step #youmightbeaDBA onpnt You have the balls to say no to your boss when they ask for the sa password #youmightbeaDBA onpnt you google to trouble shoot a problem and end up at your own blog (and it fixes it) #youmightbeaDBA onpnt You talk your wife into moving the family vacation a week earlier so you can attend the areas local SSUG meeting #youmightbeaDBA onpnt you can explain to a nontechnical person what a deadlock is #youmightbeaDBA onpnt You hope a girl asks you what your collation is #youmightbeaDBA onpnt you make jokes that include the words shrink, truncate and 1205. And you are the only one that laughs at them #youmightbeaDBA onpnt You rate your ability to stay awake to work longer on blogs, twitter, forums and your day to day job with the 5 9's goal #youmightbeaDBA onpnt you have major surgery and beg the doctor to release you back to work 5 days later because you miss your servers #youmightbeaDBA #TrueStory onpnt You do have backups and you know how to use them #youmightbeaDBA onpnt It's the network #youmightbeaDBA onpnt When the developers get to work your mood changes rapidly #youmightbeaDBA onpnt When someone says, "PASS", you first think of karaoke #youmightbeaDBA onpnt Recruiters try to get you to call them *just* because they think you'll give them @BrentO contact info #youmightbeaDBA onpnt You chuckle every time you go to grab the "CLR" Calcium, Lime and Rust Remover to clean something #youmightbeaDBA onpnt @MarlonRibunal @mikeSQL Sorry man, it was already in motion ;-) #youmightbeaDBA onpnt When you have an "I love bacon" sticker on your laptop. #youmightbeaDBA http://twitpic.com/1ry671 onpnt You sing SELECT statements in the shower #youmightbeaDBA onpnt When you see a chicken it doesn't remind you of food. It reminds you of a guy named Jorge #youmightbeaDBA onpnt At time, SQL is your mistress #youmightbeaDBA onpnt Your wife wonders if SQL is the code name of your mistress at times #youmightbeaDBA onpnt it's Friday and you are on twitter thinking really hard about what would be funny for hash tag #youmightbeaDBA onpnt You organize your wife's "decorative"pillows on the bed in a B-Tree structure #youmightbeaDBA PaulWhiteNZ If you: SELECT TOP (1) milk FROM fridge WHERE use_by_date >= GET_DATE() ORDER BY use_by_date ASC #YouMightBeaDBA RonDBA #youmightbeaDBA if you read @buckwoody's and @BrentO's blogs. ryaneastabrook @buckwoody omg, you have to stand up a website with these on them, they are awesome #youmightbeaDBA soulvy @StrateSQL @LadyRuna Or a "Splat" #youmightbeaDBA speedracer You can still fall asleep after three cups of coffee #youmightbeaDBA speedracer You retweet @buckwoody on a Friday night #youmightbeaDBA speedracer You can still fall asleep after three cups of coffee #youmightbeaDBA speedracer Developers make you twitch #youmightbeaDBA sqlagentman You know what X/1024*8 is. #YouMightBeADBA SqlAsylum Your still in the office at 5:00 on memorial day weekend. #youmightbeadba :) SQLBob Whenever someone you know gets pregnant you bring up INNER JOINs or SQL Injection attacks... #youmightbeaDBA SQLChicken You know one or more SQL folks in the community with an animal in their username #youmightbeaDBA SQLChicken You've used one or more car analogies to explain how a database works #youmightbeaDBA SQLChicken “@sqljoe: #youmightbeaDBA if you applied to attend #sqlu and requested @SQLChicken to pull strings for you” lmao nice! SQLChicken When talking about SSIS your discussions break down into various jokes about packages #youmightbeaDBA SQLChicken Just SEEING the code for cursors makes you break out in hives #youmightbeaDBA SQLChicken Just SEEING the code for cursors makes you break out in hives #youmightbeaDBA SQLCraftsman You coined the phrase "Magic SAN Dust" because calling a vendor's marketing claims BS is not acceptable in a meeting. #YouMightBeADBA SQLCraftsman If you hear about a new feature with the acronym "DAC" and wonder what disaster of a feature it is attached to this time. #YouMightBeADBA SQLCraftsman You really own a "Stick of Much Developer Whacking" #YouMightBeADBA SQLCraftsman You coined the phrase "Magic SAN Dust" because calling a vendor's marketing claims BS is not acceptable in a meeting. #YouMightBeADBA SQLCraftsman Default Blame Acceptor #YouMightBeADBA SQLCraftsman If you hear about a new feature with the acronym "DAC" and wonder what disaster of a feature it is attached to this time. #YouMightBeADBA SQLCraftsman Default Blame Acceptor #YouMightBeADBA SQLCraftsman If you hear about a new feature with the acronym "DAC" and wonder what disaster of a feature it is attached to this time. #YouMightBeADBA sqljoe #youmightbeaDBA if you wished your wife knew T-sql. USE ShoppingList SELECT NecessaryItems from Supermarket WHERE Category<> ("junk food") sqljoe #youmightbeaDBA if the first thing you kiss when you wake up is your mobile for not waking you up in the middle of the night sqljoe #youmightbeaDBA if your wife has a "Do Not Fly" family vacation list of her own including your laptop and mobile sqljoe #youmightbeaDBA if you have researched for DBA Anonymous groups and attended a #SSUG willing to drop your database (vice) sqljoe #youmightbeaDBA if your only maintenance windows are staff meetings sqljoe #youmightbeaDBA if you think of yourself as "The One" in The Matrix "balancing the equation" from The Architect's (developers) poor coding sqljoe #youmightbeaDBA if you think @PaulRandal should have played the Oracle in The Matrix sqljoe #youmightbeaDBA if home CD & Movie collection is stored in secured containers,in logical order & naming convention,and with a backup copy sqljoe #youmightbeaDBA if you applied to attend #sqlu and requested @SQLChicken to pull strings for you sqljoe #youmightbeaDBA if you have tried to TiVo @MidnightDBA broadcasts sqljoe #youmightbeaDBA if your #sql user group feels like #AA meetings sqljoe #youmightbeaDBA if you thought of bringing your #sql books to #sqlsaturday and #sqlpass for autographs sqljoe #youmightbeaDBA if #sqlpass feels like the #oscars sqljoe #youmightbeaDBA if you are proud of your small package SQLLawman #youmightbeaDBA when you hear MDX and Acura is not first thought that comes to mind. sqlrunner If your wife double checks that there isn't a SQLSat within 200 miles of your vacation destination #youmightbeaDBA sqlrunner When you're on a conference call and your wife thinks your speaking in a foreign language #youmightbeaDBA sqlrunner When you're on a conference call and your wife thinks your speaking in a foreign language #youmightbeaDBA sqlrunner You treat the word 'access' as a verb, not a noun #youmightbeaDBA sqlrunner If you are happy with sub-second performance #youmightbeaDBA sqlrunner When you know the names of the NOC people AND their families #youmightbeadba sqlrunner When you know the names of the NOC people AND their families #youmightbeadba sqlrunner Your company set's up international phone coverage for your cruise #youmightbeaDBA sqlsamson @buckwoody if your manager asks you for data and you respond with "there's a script for that" #youmightbeadba sqlsamson @buckwoody If you receive more messages from your server then your spouse #youmightbeadba SQLSoldier You've spent all night Valentines Day upgrading the SQL Servers and forgot to tell your wife you'd be working late. #youmightbeadba SQLSoldier You're flattered when someone calls you a geek. #youmightbeadba SQLSoldier @llangit @mrdenny it's 11pm on a holiday weekend, & your reading stupid jokes on Twitter then #youmightbeadba SQLSoldier Your manager borrows lunch money from you because your salary is 30% higher than his. #youmightbeaDBA SQLSoldier You think "intellisense" is a double negative because it's not intelligent nor makes sense. #youmightbeaDBA SQLSoldier 75% of the emails you receive at home have the phrase "now following you on Twitter!" in the subject line. #youmightbeaDBA SQLSoldier You petition Ken Burns to remake Office Space because it should have been 18 hours long. #youmightbeaDBA SQLSoldier You select a candidate for a Jr DBA position because his resume said he's willing to get your coffee. #youmightbeaDBA SQLSoldier Somebody misquotes @PaulRandall and you call him on your cell to verify. #youmightbeaDBA SQLSoldier You wish the elevator in your building was slower because it's the last time you'll be left alone all day. #youmightbeaDBA SQLSoldier The developers sacrifice small animals before giving you their code for review. #youmightbeaDBA SQLSoldier Developers bring you coffee and a BLT when you review their code. #youmightbeaDBA #IWish SQLSoldier You can get out of any family get-together by saying you have to work and nobody questions it. #youmightbeaDBA SQLSoldier You've requested a HP Superdome for you "test" box. #youmightbeaDBA SQLSoldier Your leave work early because your internet connection to the data center is better at home #youmightbeaDBA SQLSoldier The new CEO asks you to justify your salary, so you go on vacation for 2 weeks. And he never questions you again. #youmightbeaDBA SQLSoldier You cheer when Milton burns down the company in Office Space #youmightbeaDBA SQLSoldier A dev. asks if you've heard about some great new feature in SQL and you show the 16 blog posts you wrote on it ... last year #youmightbeaDBA SQLSoldier Your dev team is still testing SQL 2008 and you're already planning for SQL 11. #youmightbeaDBA #TrueStory SQLSoldier The new CEO asks you to justify your salary, so you go on vacation for 2 weeks. And he never questions you again. #youmightbeaDBA SQLSoldier Your dev team is still testing SQL 2008 and you're already planning for SQL 11. #youmightbeaDBA SQLSoldier You use a cell phone service coverage map to plan your next vacation. #youmightbeaDBA SQLSoldier You come in to work at 7 AM because it gives you at least 3 hours without any developers around. #youmightbeaDBA SQLSoldier You figure out a way to make take your wife on a cruise and deduct it as a business expense. #youmightbeaDBA #sqlcruise SQLSoldier You name your cat SQLDog because the name @SQLCat was already taken. #youmightbeaDBA SQLSoldier You rate your blog posts based on the number of retweets you get. #youmightbeaDBA SQLSoldier You disable random logins just to mess with people. #youmightbeaDBA SQLSoldier You fall for the pickup line, "Hey baby, what's your collation?" #youmightbeaDBA SQLSoldier You can blame an outage on anyone in the company because you're the only one that knows how to find out what really happened #youmightbeaDBA SQLSoldier You can blame an outage on anyone in the company because you're the only one that knows how to find out what really happened #youmightbeaDBA SQLSoldier You cheer when Milton burns down the company in Office Space #youmightbeaDBA SQLSoldier Your leave work early because your internet connection to the data center is better at home #youmightbeaDBA SQLSoldier You cheer when Milton burns down the company in Office Space #youmightbeaDBA SQLSoldier Your think the 4 food groups are coffee, bacon, fast food, and Mountain Dew. #youmightbeaDBA SQLSoldier You tell someone your job title and they ask "What?" You describe it and they ask "What?". So you say "computer geek". #youmightbeaDBA SQLSoldier The #1 referrer to your blog is Twitter.com. #youmightbeaDBA SQLSoldier Your idea of a good time on a Saturday involves free training. #youmightbeaDBA #sqlsat43 SQLSoldier You write a book that all of your co-workers have and none have read it. #youmightbeaDBA SQLSoldier You write a book that sells a couple thousand copies and is heralded a best seller. #youmightbeaDBA SQLSoldier No matter how sick you are, you go to work if it's time to pass the pager on to the next guy. #youmightbeaDBA #TrueStory SQLSoldier You go out on the town, and strangers walk up to you and say, "Hey you're that SQL guy" #youmightbeaDBA #TrueStory SQLSoldier Your wife asks you to fix something, and you request a downtime window. #youmightbeaDBA SQLSoldier Your wife asks when you'll be home, and you tell her that you wish you knew. #youmightbeaDBA SQLSoldier Your best pickup line, "Hey baby, what's your collation?" #youmightbeaDBA SQLSoldier Your wife asks when you'll be home, and you tell her that you wish you knew. #youmightbeaDBA SQLSoldier You know that @BuckWoody is not someone's porno name. #youmightbeaDBA SQLSoldier You list TSQL as your native language on the 2010 census. #youmightbeaDBA SQLSoldier Starbucks' stock price drops every time you go on vacation. #youmightbeaDBA SQLSoldier You're happy when the web master says that the website is down. #youmightbeaDBA SQLSoldier You know that @BuckWoody is not someone's porno name. #youmightbeaDBA SQLSoldier You get mad when someone calls your car a "heap" because you've always considered it to be a "clustered index". #youmightbeaDBA SQLSoldier Your blog has more hits than your company's website. #youmightbeaDBA SQLSoldier You systematically remove the asterisk key from all keyboards in the company except yours. #youmightbeaDBA SQLSoldier When asked if you recycle, you reply that you run sp_cycle_errorlog every night at midnight #youmightbeaDBA SQLSoldier You wouldn't allow someone named @AdamMachanic to work on your car. #youmightbeaDBA SQLSoldier You switch offices every 3 days to avoid developers #youmightbeaDBA SQLSoldier PSS has your number on speed dial. #youmightbeaDBA SQLSoldier You frown when you they tell Neo that he's going to the Oracle #youmightbeaDBA swhaley you regretted saying "This shouldn't effect production" #youmightbeaDBA swhaley you regretted saying "This shouldn't effect production" #youmightbeaDBA Tarwn A pleasurable saturday means spending the day learning more about what you already do the rest of the week #youmightbeaDBA ...oh, wait... thelostforum For great justice; all our base are belong to YOU !! #youmightbeadba thelostforum @SQLSoldier: You need a witness to use a mirror #youmightbeaDBA ;) TimCost you capitalize key words. always. everywhere. you can't help it, usually don't even notice. #youmightbeaDBA Toshana Your the only one in your company not impressed with the developers new application. #youmightbeaDBA venzann Coming soon from a (respected) book publisher - @buckwoody's #youmightbeaDBA venzann He's on a role tonight. @buckwoody is summing up my life with his #youmightbeaDBA tweets... venzann I love the #youmightbeaDBA tag. Found at least 6 new DBAs to follow.. venzann He's on a role tonight. @buckwoody is summing up my life with his #youmightbeaDBA tweets... venzann You use #sqlhelp as a primary resource during troubleshooting #youmightbeaDBA venzann You insist on stricter password security for your sql servers than you implement on your own laptop #youmightbeaDBA WesBrownSQL @buckwoody you are up so late the only tweets you see are from @buckwoody #youmightbeaDBA WesBrownSQL @SQLSoldier you are upgrading all your 2005 prod servers to 2008 R2 on a three day weekend... #youmightbeaDBA zippy1981 #youmightbeaDBA if everytime you do something with #mongodb you think of the Vulcan proverb "only Nixon could go to China." Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
How to Use Images as Navigation with innerfade Slideshow?

- by Katie

I am very new to JavaScript and only have the most basic understanding of how it works, so please bear with me. :) I'm using the jquery.innerfade.js script to create a slideshow with fade transitions for a website I'm developing, and I have added navigation buttons (which are set as background-images) that navigate between the “slides”. The navigation buttons have three states: default/off, hover, and on (each state is a separate image). I created a separate JavaScript document to set the buttons to “on” when they are clicked. The “hover” state is achieved through the CSS. Both the slideshow and the navigation buttons work well. There is just one thing I want to add: I would like the appropriate navigation button to display as “on” while the related “slide” is “playing”. Here's the HTML: <div id="mainFeature"> <ul id="theFeature"> <li id="the1feature"><a href="#" name="#promo1"><img src="_images/carousel/promo1.jpg" /></a></li> <li id="the2feature"><a href="#" name="#promo2"><img src="_images/carousel/promo2.jpg" /></a></li> <li id="the3feature"><a href="#" name="#promo3"><img src="_images/carousel/promo3.jpg" /></a></li> </ul> <div id="promonav-con"> <div id="primarypromonav"> <ul class="links"> <li id="the1title" class="promotop"><a rel="1" href="#promo1" class="promo1" id="promo1" onMouseDown="promo1on()"><strong>Botox Cosmetic</strong></a></li> <li id="the2title" class="promotop"><a rel="2" href="#promo2" class="promo2" id="promo2" onMouseDown="promo2on()"><strong>Promo 2</strong></a></li> <li id="the3title" class="promotop"><a rel="3" href="#promo3" class="promo3" id="promo3" onMouseDown="promo3on()"><strong>Promo 3</strong></a></li> </ul> </div> </div> And here is the jquery.innerfade.js, with my changes: (function($) { $.fn.innerfade = function(options) { return this.each(function() { $.innerfade(this, options); }); }; $.innerfade = function(container, options) { var settings = { 'speed': 'normal', 'timeout': 2000, 'containerheight': 'auto', 'runningclass': 'innerfade', 'children': null }; if (options) $.extend(settings, options); if (settings.children === null) var elements = $(container).children(); else var elements = $(container).children(settings.children); if (elements.length > 1) { $(container).css('position', 'relative').css('height', settings.containerheight).addClass(settings.runningclass); for (var i = 0; i < elements.length; i++) { $(elements[i]).css('z-index', String(elements.length-i)).css('position', 'absolute').hide(); }; this.ifchanger = setTimeout(function() { $.innerfade.next(elements, settings, 1, 0); }, settings.timeout); $(elements[0]).show(); } }; $.innerfade.next = function(elements, settings, current, last) { $(elements[last]).fadeOut(settings.speed); $(elements[current]).fadeIn(settings.speed, function() { removeFilter($(this)[0]); }); if ((current + 1) < elements.length) { current = current + 1; last = current - 1; } else { current = 0; last = elements.length - 1; } this.ifchanger = setTimeout((function() { $.innerfade.next(elements, settings, current, last); }), settings.timeout); }; })(jQuery); // **** remove Opacity-Filter in ie **** function removeFilter(element) { if(element.style.removeAttribute){ element.style.removeAttribute('filter'); } } jQuery(document).ready(function() { jQuery('ul#theFeature').innerfade({ speed: 1000, timeout: 7000, containerheight: '291px' }); // jQuery('#mainFeature .links').children('li').children('a').attr('href', 'javascript:void(0);'); jQuery('#mainFeature .links').children('li').children('a').click(function() { clearTimeout(jQuery.innerfade.ifchanger); for(i=1;i<5;i++) { jQuery('#the'+i+'feature').css("display", "none"); //jQuery('#the'+i+'title').children('a').css("background-color","#226478"); } // if(the_widths[(jQuery(this).attr('rel')-1)]==960) { // jQuery("#vic").hide(); // } else { // jQuery("#vic").show(); // } // jQuery('#the'+(jQuery(this).attr('rel'))+'title').css("background-color", "#286a7f"); jQuery('#the'+(jQuery(this).attr('rel'))+'feature').css("display", "block"); clearTimeout(jQuery.innerfade.ifchanger); }); }); And the separate JavaScript that I created: function promo1on() {document.getElementById("promo1").className="promo1on"; document.getElementById("promo2").className="promo2"; document.getElementById("promo2").className="promo2"; } function promo2on() {document.getElementById("promo2").className="promo2on"; document.getElementById("promo1").className="promo1"; document.getElementById("promo3").className="promo3"; } function promo3on() {document.getElementById("promo3").className="promo3on"; document.getElementById("promo1").className="promo1"; document.getElementById("promo2").className="promo2"; } And, finally, the CSS: #mainFeature {float: left; width: 672px; height: 290px; margin: 0 0 9px 0; list-style: none;} #mainFeature li {list-style: none;} #mainFeature #theFeature {margin: 0; padding: 0; position: relative;} #mainFeature #theFeature li {position: absolute; top: 0; left: 0;} #promonav-con {width: 463px; height: 26px; padding: 0; margin: 0; position: absolute; z-index: 900; top: 407px; left: 283px;} #primarypromonav {padding: 0; margin: 0;} #mainFeature .links {padding: 0; margin: 0; list-style: none; position: relative; font-family: arial, verdana, sans-serif; width: 463px; height: 26px;} #mainFeature .links li.promotop {list-style: none; display: block; float: left; display: inline; margin: 0; padding: 0;} #mainFeature .links li a {display: block; float: left; display: inline; height: 26px; text-decoration: none; margin: 0; padding: 0; cursor: pointer;} #mainFeature .links li a strong {margin-left: -9999px;} #mainFeature .links li a.promo1 {background: url(../_images/carouselnav/promo1.gif); width: 155px;} #mainFeature .links li:hover a.promo1 {background: url(../_images/carouselnav/promo1_hover.gif); width: 155px;} #mainFeature .links li a.promo1:hover {background: url(../_images/carouselnav/promo1_hover.gif); width: 155px;} .promo1on {background: url(../_images/carouselnav/promo1_on.gif); width: 155px;} #mainFeature .links li a.promo2 {background: url(../_images/carouselnav/promo2.gif); width: 153px;} #mainFeature .links li:hover a.promo2 {background: url(../_images/carouselnav/promo2_hover.gif); width: 153px;} #mainFeature .links li a.promo2:hover {background: url(../_images/carouselnav/promo2_hover.gif); width: 153px;} .promo2on {background: url(../_images/carouselnav/promo2_on.gif); width: 153px;} #mainFeature .links li a.promo3 {background: url(../_images/carouselnav/promo3.gif); width: 155px;} #mainFeature .links li:hover a.promo3 {background: url(../_images/carouselnav/promo3_hover.gif); width: 155px;} #mainFeature .links li a.promo3:hover {background: url(../_images/carouselnav/promo3_hover.gif); width: 155px;} .promo3on {background: url(../_images/carouselnav/promo3_on.gif); width: 155px;} Hopefully this makes sense! Again, I'm very new to JavaScript/JQuery, so I apologize if this is a mess. I'm very grateful for any suggestions. Thanks!

Read the article
Is there a Math.atan2 substitute for j2ME? Blackberry development

- by Kai

I have a wide variety of locations stored in my persistent object that contain latitudes and longitudes in double(43.7389, 7.42577) format. I need to be able to grab the user's latitude and longitude and select all items within, say 1 mile. Walking distance. I have done this in PHP so I snagged my PHP code and transferred it to Java, where everything plugged in fine until I figured out J2ME doesn't support atan2(double, double). So, after some searching, I find a small snippet of code that is supposed to be a substitute for atan2. Here is the code: public double atan2(double y, double x) { double coeff_1 = Math.PI / 4d; double coeff_2 = 3d * coeff_1; double abs_y = Math.abs(y)+ 1e-10f; double r, angle; if (x >= 0d) { r = (x - abs_y) / (x + abs_y); angle = coeff_1; } else { r = (x + abs_y) / (abs_y - x); angle = coeff_2; } angle += (0.1963f * r * r - 0.9817f) * r; return y < 0.0f ? -angle : angle; } I am getting odd results from this. My min and max latitude and longitudes are coming back as incredibly low numbers that can't possibly be right. Like 0.003785746 when I am expecting something closer to the original lat and long values (43.7389, 7.42577). Since I am no master of advanced math, I don't really know what to look for here. Perhaps someone else may have an answer. Here is my complete code: package store_finder; import java.util.Vector; import javax.microedition.location.Criteria; import javax.microedition.location.Location; import javax.microedition.location.LocationException; import javax.microedition.location.LocationListener; import javax.microedition.location.LocationProvider; import javax.microedition.location.QualifiedCoordinates; import net.rim.blackberry.api.invoke.Invoke; import net.rim.blackberry.api.invoke.MapsArguments; import net.rim.device.api.system.Bitmap; import net.rim.device.api.system.Display; import net.rim.device.api.ui.Color; import net.rim.device.api.ui.Field; import net.rim.device.api.ui.Graphics; import net.rim.device.api.ui.Manager; import net.rim.device.api.ui.component.BitmapField; import net.rim.device.api.ui.component.RichTextField; import net.rim.device.api.ui.component.SeparatorField; import net.rim.device.api.ui.container.HorizontalFieldManager; import net.rim.device.api.ui.container.MainScreen; import net.rim.device.api.ui.container.VerticalFieldManager; public class nearBy extends MainScreen { private HorizontalFieldManager _top; private VerticalFieldManager _middle; private int horizontalOffset; private final static long animationTime = 300; private long animationStart = 0; private double latitude = 43.7389; private double longitude = 7.42577; private int _interval = -1; private double max_lat; private double min_lat; private double max_lon; private double min_lon; private double latitude_in_degrees; private double longitude_in_degrees; public nearBy() { super(); horizontalOffset = Display.getWidth(); _top = new HorizontalFieldManager(Manager.USE_ALL_WIDTH | Field.FIELD_HCENTER) { public void paint(Graphics gr) { Bitmap bg = Bitmap.getBitmapResource("bg.png"); gr.drawBitmap(0, 0, Display.getWidth(), Display.getHeight(), bg, 0, 0); subpaint(gr); } }; _middle = new VerticalFieldManager() { public void paint(Graphics graphics) { graphics.setBackgroundColor(0xFFFFFF); graphics.setColor(Color.BLACK); graphics.clear(); super.paint(graphics); } protected void sublayout(int maxWidth, int maxHeight) { int displayWidth = Display.getWidth(); int displayHeight = Display.getHeight(); super.sublayout( displayWidth, displayHeight); setExtent( displayWidth, displayHeight); } }; add(_top); add(_middle); Bitmap lol = Bitmap.getBitmapResource("logo.png"); BitmapField lolfield = new BitmapField(lol); _top.add(lolfield); Criteria cr= new Criteria(); cr.setCostAllowed(true); cr.setPreferredResponseTime(60); cr.setHorizontalAccuracy(5000); cr.setVerticalAccuracy(5000); cr.setAltitudeRequired(true); cr.isSpeedAndCourseRequired(); cr.isAddressInfoRequired(); try{ LocationProvider lp = LocationProvider.getInstance(cr); if( lp!=null ){ lp.setLocationListener(new LocationListenerImpl(), _interval, 1, 1); } } catch(LocationException le) { add(new RichTextField("Location exception "+le)); } //_middle.add(new RichTextField("this is a map " + Double.toString(latitude) + " " + Double.toString(longitude))); int lat = (int) (latitude * 100000); int lon = (int) (longitude * 100000); String document = "<location-document>" + "<location lon='" + lon + "' lat='" + lat + "' label='You are here' description='You' zoom='0' />" + "<location lon='742733' lat='4373930' label='Hotel de Paris' description='Hotel de Paris' address='Palace du Casino' postalCode='98000' phone='37798063000' zoom='0' />" + "</location-document>"; // Invoke.invokeApplication(Invoke.APP_TYPE_MAPS, new MapsArguments( MapsArguments.ARG_LOCATION_DOCUMENT, document)); _middle.add(new SeparatorField()); surroundingVenues(); _middle.add(new RichTextField("max lat: " + max_lat)); _middle.add(new RichTextField("min lat: " + min_lat)); _middle.add(new RichTextField("max lon: " + max_lon)); _middle.add(new RichTextField("min lon: " + min_lon)); } private void surroundingVenues() { double point_1_latitude_in_degrees = latitude; double point_1_longitude_in_degrees= longitude; // diagonal distance + error margin double distance_in_miles = (5 * 1.90359441) + 10; getCords (point_1_latitude_in_degrees, point_1_longitude_in_degrees, distance_in_miles, 45); double lat_limit_1 = latitude_in_degrees; double lon_limit_1 = longitude_in_degrees; getCords (point_1_latitude_in_degrees, point_1_longitude_in_degrees, distance_in_miles, 135); double lat_limit_2 = latitude_in_degrees; double lon_limit_2 = longitude_in_degrees; getCords (point_1_latitude_in_degrees, point_1_longitude_in_degrees, distance_in_miles, -135); double lat_limit_3 = latitude_in_degrees; double lon_limit_3 = longitude_in_degrees; getCords (point_1_latitude_in_degrees, point_1_longitude_in_degrees, distance_in_miles, -45); double lat_limit_4 = latitude_in_degrees; double lon_limit_4 = longitude_in_degrees; double mx1 = Math.max(lat_limit_1, lat_limit_2); double mx2 = Math.max(lat_limit_3, lat_limit_4); max_lat = Math.max(mx1, mx2); double mm1 = Math.min(lat_limit_1, lat_limit_2); double mm2 = Math.min(lat_limit_3, lat_limit_4); min_lat = Math.max(mm1, mm2); double mlon1 = Math.max(lon_limit_1, lon_limit_2); double mlon2 = Math.max(lon_limit_3, lon_limit_4); max_lon = Math.max(mlon1, mlon2); double minl1 = Math.min(lon_limit_1, lon_limit_2); double minl2 = Math.min(lon_limit_3, lon_limit_4); min_lon = Math.max(minl1, minl2); //$qry = "SELECT DISTINCT zip.zipcode, zip.latitude, zip.longitude, sg_stores.* FROM zip JOIN store_finder AS sg_stores ON sg_stores.zip=zip.zipcode WHERE zip.latitude<=$lat_limit_max AND zip.latitude>=$lat_limit_min AND zip.longitude<=$lon_limit_max AND zip.longitude>=$lon_limit_min"; } private void getCords(double point_1_latitude, double point_1_longitude, double distance, int degs) { double m_EquatorialRadiusInMeters = 6366564.86; double m_Flattening=0; double distance_in_meters = distance * 1609.344 ; double direction_in_radians = Math.toRadians( degs ); double eps = 0.000000000000005; double r = 1.0 - m_Flattening; double point_1_latitude_in_radians = Math.toRadians( point_1_latitude ); double point_1_longitude_in_radians = Math.toRadians( point_1_longitude ); double tangent_u = (r * Math.sin( point_1_latitude_in_radians ) ) / Math.cos( point_1_latitude_in_radians ); double sine_of_direction = Math.sin( direction_in_radians ); double cosine_of_direction = Math.cos( direction_in_radians ); double heading_from_point_2_to_point_1_in_radians = 0.0; if ( cosine_of_direction != 0.0 ) { heading_from_point_2_to_point_1_in_radians = atan2( tangent_u, cosine_of_direction ) * 2.0; } double cu = 1.0 / Math.sqrt( ( tangent_u * tangent_u ) + 1.0 ); double su = tangent_u * cu; double sa = cu * sine_of_direction; double c2a = ( (-sa) * sa ) + 1.0; double x= Math.sqrt( ( ( ( 1.0 /r /r ) - 1.0 ) * c2a ) + 1.0 ) + 1.0; x= (x- 2.0 ) / x; double c= 1.0 - x; c= ( ( (x * x) / 4.0 ) + 1.0 ) / c; double d= ( ( 0.375 * (x * x) ) -1.0 ) * x; tangent_u = distance_in_meters /r / m_EquatorialRadiusInMeters /c; double y= tangent_u; boolean exit_loop = false; double cosine_of_y = 0.0; double cz = 0.0; double e = 0.0; double term_1 = 0.0; double term_2 = 0.0; double term_3 = 0.0; double sine_of_y = 0.0; while( exit_loop != true ) { sine_of_y = Math.sin(y); cosine_of_y = Math.cos(y); cz = Math.cos( heading_from_point_2_to_point_1_in_radians + y); e = (cz * cz * 2.0 ) - 1.0; c = y; x = e * cosine_of_y; y = (e + e) - 1.0; term_1 = ( sine_of_y * sine_of_y * 4.0 ) - 3.0; term_2 = ( ( term_1 * y * cz * d) / 6.0 ) + x; term_3 = ( ( term_2 * d) / 4.0 ) -cz; y= ( term_3 * sine_of_y * d) + tangent_u; if ( Math.abs(y - c) > eps ) { exit_loop = false; } else { exit_loop = true; } } heading_from_point_2_to_point_1_in_radians = ( cu * cosine_of_y * cosine_of_direction ) - ( su * sine_of_y ); c = r * Math.sqrt( ( sa * sa ) + ( heading_from_point_2_to_point_1_in_radians * heading_from_point_2_to_point_1_in_radians ) ); d = ( su * cosine_of_y ) + ( cu * sine_of_y * cosine_of_direction ); double point_2_latitude_in_radians = atan2(d, c); c = ( cu * cosine_of_y ) - ( su * sine_of_y * cosine_of_direction ); x = atan2( sine_of_y * sine_of_direction, c); c = ( ( ( ( ( -3.0 * c2a ) + 4.0 ) * m_Flattening ) + 4.0 ) * c2a * m_Flattening ) / 16.0; d = ( ( ( (e * cosine_of_y * c) + cz ) * sine_of_y * c) + y) * sa; double point_2_longitude_in_radians = ( point_1_longitude_in_radians + x) - ( ( 1.0 - c) * d * m_Flattening ); heading_from_point_2_to_point_1_in_radians = atan2( sa, heading_from_point_2_to_point_1_in_radians ) + Math.PI; latitude_in_degrees = Math.toRadians( point_2_latitude_in_radians ); longitude_in_degrees = Math.toRadians( point_2_longitude_in_radians ); } public double atan2(double y, double x) { double coeff_1 = Math.PI / 4d; double coeff_2 = 3d * coeff_1; double abs_y = Math.abs(y)+ 1e-10f; double r, angle; if (x >= 0d) { r = (x - abs_y) / (x + abs_y); angle = coeff_1; } else { r = (x + abs_y) / (abs_y - x); angle = coeff_2; } angle += (0.1963f * r * r - 0.9817f) * r; return y < 0.0f ? -angle : angle; } private Vector fetchVenues(double max_lat, double min_lat, double max_lon, double min_lon) { return new Vector(); } private class LocationListenerImpl implements LocationListener { public void locationUpdated(LocationProvider provider, Location location) { if(location.isValid()) { nearBy.this.longitude = location.getQualifiedCoordinates().getLongitude(); nearBy.this.latitude = location.getQualifiedCoordinates().getLatitude(); //double altitude = location.getQualifiedCoordinates().getAltitude(); //float speed = location.getSpeed(); } } public void providerStateChanged(LocationProvider provider, int newState) { // MUST implement this. Should probably do something useful with it as well. } } } please excuse the mess. I have the user lat long hard coded since I do not have GPS functional yet. You can see the SQL query commented out to know how I plan on using the min and max lat and long values. Any help is appreciated. Thanks

Read the article
SQLite, python, unicode, and non-utf data

- by Nathan Spears

I started by trying to store strings in sqlite using python, and got the message: sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings. Ok, I switched to Unicode strings. Then I started getting the message: sqlite3.OperationalError: Could not decode to UTF-8 column 'tag_artist' with text 'Sigur Rós' when trying to retrieve data from the db. More research and I started encoding it in utf8, but then 'Sigur Rós' starts looking like 'Sigur RÃ³s' note: My console was set to display in 'latin_1' as @John Machin pointed out. What gives? After reading this, describing exactly the same situation I'm in, it seems as if the advice is to ignore the other advice and use 8-bit bytestrings after all. I didn't know much about unicode and utf before I started this process. I've learned quite a bit in the last couple hours, but I'm still ignorant of whether there is a way to correctly convert 'ó' from latin-1 to utf-8 and not mangle it. If there isn't, why would sqlite 'highly recommend' I switch my application to unicode strings? I'm going to update this question with a summary and some example code of everything I've learned in the last 24 hours so that someone in my shoes can have an easy(er) guide. If the information I post is wrong or misleading in any way please tell me and I'll update, or one of you senior guys can update. Summary of answers Let me first state the goal as I understand it. The goal in processing various encodings, if you are trying to convert between them, is to understand what your source encoding is, then convert it to unicode using that source encoding, then convert it to your desired encoding. Unicode is a base and encodings are mappings of subsets of that base. utf_8 has room for every character in unicode, but because they aren't in the same place as, for instance, latin_1, a string encoded in utf_8 and sent to a latin_1 console will not look the way you expect. In python the process of getting to unicode and into another encoding looks like: str.decode('source_encoding').encode('desired_encoding') or if the str is already in unicode str.encode('desired_encoding') For sqlite I didn't actually want to encode it again, I wanted to decode it and leave it in unicode format. Here are four things you might need to be aware of as you try to work with unicode and encodings in python. The encoding of the string you want to work with, and the encoding you want to get it to. The system encoding. The console encoding. The encoding of the source file Elaboration: (1) When you read a string from a source, it must have some encoding, like latin_1 or utf_8. In my case, I'm getting strings from filenames, so unfortunately, I could be getting any kind of encoding. Windows XP uses UCS-2 (a Unicode system) as its native string type, which seems like cheating to me. Fortunately for me, the characters in most filenames are not going to be made up of more than one source encoding type, and I think all of mine were either completely latin_1, completely utf_8, or just plain ascii (which is a subset of both of those). So I just read them and decoded them as if they were still in latin_1 or utf_8. It's possible, though, that you could have latin_1 and utf_8 and whatever other characters mixed together in a filename on Windows. Sometimes those characters can show up as boxes, other times they just look mangled, and other times they look correct (accented characters and whatnot). Moving on. (2) Python has a default system encoding that gets set when python starts and can't be changed during runtime. See here for details. Dirty summary ... well here's the file I added: \# sitecustomize.py \# this file can be anywhere in your Python path, \# but it usually goes in ${pythondir}/lib/site-packages/ import sys sys.setdefaultencoding('utf_8') This system encoding is the one that gets used when you use the unicode("str") function without any other encoding parameters. To say that another way, python tries to decode "str" to unicode based on the default system encoding. (3) If you're using IDLE or the command-line python, I think that your console will display according to the default system encoding. I am using pydev with eclipse for some reason, so I had to go into my project settings, edit the launch configuration properties of my test script, go to the Common tab, and change the console from latin-1 to utf-8 so that I could visually confirm what I was doing was working. (4) If you want to have some test strings, eg test_str = "ó" in your source code, then you will have to tell python what kind of encoding you are using in that file. (FYI: when I mistyped an encoding I had to ctrl-Z because my file became unreadable.) This is easily accomplished by putting a line like so at the top of your source code file: # -*- coding: utf_8 -*- If you don't have this information, python attempts to parse your code as ascii by default, and so: SyntaxError: Non-ASCII character '\xf3' in file _redacted_ on line 81, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details Once your program is working correctly, or, if you aren't using python's console or any other console to look at output, then you will probably really only care about #1 on the list. System default and console encoding are not that important unless you need to look at output and/or you are using the builtin unicode() function (without any encoding parameters) instead of the string.decode() function. I wrote a demo function I will paste into the bottom of this gigantic mess that I hope correctly demonstrates the items in my list. Here is some of the output when I run the character 'ó' through the demo function, showing how various methods react to the character as input. My system encoding and console output are both set to utf_8 for this run: '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Now I will change the system and console encoding to latin_1, and I get this output for the same input: 'ó' = original char <type 'str'> repr(char)='\xf3' 'ó' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Notice that the 'original' character displays correctly and the builtin unicode() function works now. Now I change my console output back to utf_8. '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' '?' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Here everything still works the same as last time but the console can't display the output correctly. Etc. The function below also displays more information that this and hopefully would help someone figure out where the gap in their understanding is. I know all this information is in other places and more thoroughly dealt with there, but I hope that this would be a good kickoff point for someone trying to get coding with python and/or sqlite. Ideas are great but sometimes source code can save you a day or two of trying to figure out what functions do what. Disclaimers: I'm no encoding expert, I put this together to help my own understanding. I kept building on it when I should have probably started passing functions as arguments to avoid so much redundant code, so if I can I'll make it more concise. Also, utf_8 and latin_1 are by no means the only encoding schemes, they are just the two I was playing around with because I think they handle everything I need. Add your own encoding schemes to the demo function and test your own input. One more thing: there are apparently crazy application developers making life difficult in Windows. #!/usr/bin/env python # -*- coding: utf_8 -*- import os import sys def encodingDemo(str): validStrings = () try: print "str =",str,"{0} repr(str) = {1}".format(type(str), repr(str)) validStrings += ((str,""),) except UnicodeEncodeError as ude: print "Couldn't print the str itself because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print ude try: x = unicode(str) print "unicode(str) = ",x validStrings+= ((x, " decoded into unicode by the default system encoding"),) except UnicodeDecodeError as ude: print "ERROR. unicode(str) couldn't decode the string because the system encoding is set to an encoding that doesn't understand some character in the string." print "\tThe system encoding is set to {0}. See error:\n\t".format(sys.getdefaultencoding()), print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the unicode(str) because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('latin_1') print "str.decode('latin_1') =",x validStrings+= ((x, " decoded with latin_1 into unicode"),) try: print "str.decode('latin_1').encode('utf_8') =",str.decode('latin_1').encode('utf_8') validStrings+= ((x, " decoded with latin_1 into unicode and encoded into utf_8"),) except UnicodeDecodeError as ude: print "The string was decoded into unicode using the latin_1 encoding, but couldn't be encoded into utf_8. See error:\n\t", print ude except UnicodeDecodeError as ude: print "Something didn't work, probably because the string wasn't latin_1 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('latin_1') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('utf_8') print "str.decode('utf_8') =",x validStrings+= ((x, " decoded with utf_8 into unicode"),) try: print "str.decode('utf_8').encode('latin_1') =",str.decode('utf_8').encode('latin_1') except UnicodeDecodeError as ude: print "str.decode('utf_8').encode('latin_1') didn't work. The string was decoded into unicode using the utf_8 encoding, but couldn't be encoded into latin_1. See error:\n\t", validStrings+= ((x, " decoded with utf_8 into unicode and encoded into latin_1"),) print ude except UnicodeDecodeError as ude: print "str.decode('utf_8') didn't work, probably because the string wasn't utf_8 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('utf_8') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t",uee print print "Printing information about each character in the original string." for char in str: try: print "\t'" + char + "' = original char {0} repr(char)={1}".format(type(char), repr(char)) except UnicodeDecodeError as ude: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), ude) except UnicodeEncodeError as uee: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), uee) print uee try: x = unicode(char) print "\t'" + x + "' = unicode(char) {1} repr(unicode(char))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = unicode(char) ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = unicode(char) {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('latin_1') print "\t'" + x + "' = char.decode('latin_1') {1} repr(char.decode('latin_1'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('latin_1') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('latin_1') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('utf_8') print "\t'" + x + "' = char.decode('utf_8') {1} repr(char.decode('utf_8'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('utf_8') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('utf_8') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) print x = 'ó' encodingDemo(x) Much thanks for the answers below and especially to @John Machin for answering so thoroughly.

Read the article
SQLite, python, unicode, and non-utf data

- by Nathan Spears

I started by trying to store strings in sqlite using python, and got the message: sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings. Ok, I switched to Unicode strings. Then I started getting the message: sqlite3.OperationalError: Could not decode to UTF-8 column 'tag_artist' with text 'Sigur Rós' when trying to retrieve data from the db. More research and I started encoding it in utf8, but then 'Sigur Rós' starts looking like 'Sigur RÃ³s' note: My console was set to display in 'latin_1' as @John Machin pointed out. What gives? After reading this, describing exactly the same situation I'm in, it seems as if the advice is to ignore the other advice and use 8-bit bytestrings after all. I didn't know much about unicode and utf before I started this process. I've learned quite a bit in the last couple hours, but I'm still ignorant of whether there is a way to correctly convert 'ó' from latin-1 to utf-8 and not mangle it. If there isn't, why would sqlite 'highly recommend' I switch my application to unicode strings? I'm going to update this question with a summary and some example code of everything I've learned in the last 24 hours so that someone in my shoes can have an easy(er) guide. If the information I post is wrong or misleading in any way please tell me and I'll update, or one of you senior guys can update. Summary of answers Let me first state the goal as I understand it. The goal in processing various encodings, if you are trying to convert between them, is to understand what your source encoding is, then convert it to unicode using that source encoding, then convert it to your desired encoding. Unicode is a base and encodings are mappings of subsets of that base. utf_8 has room for every character in unicode, but because they aren't in the same place as, for instance, latin_1, a string encoded in utf_8 and sent to a latin_1 console will not look the way you expect. In python the process of getting to unicode and into another encoding looks like: str.decode('source_encoding').encode('desired_encoding') or if the str is already in unicode str.encode('desired_encoding') For sqlite I didn't actually want to encode it again, I wanted to decode it and leave it in unicode format. Here are four things you might need to be aware of as you try to work with unicode and encodings in python. The encoding of the string you want to work with, and the encoding you want to get it to. The system encoding. The console encoding. The encoding of the source file Elaboration: (1) When you read a string from a source, it must have some encoding, like latin_1 or utf_8. In my case, I'm getting strings from filenames, so unfortunately, I could be getting any kind of encoding. Windows XP uses UCS-2 (a Unicode system) as its native string type, which seems like cheating to me. Fortunately for me, the characters in most filenames are not going to be made up of more than one source encoding type, and I think all of mine were either completely latin_1, completely utf_8, or just plain ascii (which is a subset of both of those). So I just read them and decoded them as if they were still in latin_1 or utf_8. It's possible, though, that you could have latin_1 and utf_8 and whatever other characters mixed together in a filename on Windows. Sometimes those characters can show up as boxes, other times they just look mangled, and other times they look correct (accented characters and whatnot). Moving on. (2) Python has a default system encoding that gets set when python starts and can't be changed during runtime. See here for details. Dirty summary ... well here's the file I added: \# sitecustomize.py \# this file can be anywhere in your Python path, \# but it usually goes in ${pythondir}/lib/site-packages/ import sys sys.setdefaultencoding('utf_8') This system encoding is the one that gets used when you use the unicode("str") function without any other encoding parameters. To say that another way, python tries to decode "str" to unicode based on the default system encoding. (3) If you're using IDLE or the command-line python, I think that your console will display according to the default system encoding. I am using pydev with eclipse for some reason, so I had to go into my project settings, edit the launch configuration properties of my test script, go to the Common tab, and change the console from latin-1 to utf-8 so that I could visually confirm what I was doing was working. (4) If you want to have some test strings, eg test_str = "ó" in your source code, then you will have to tell python what kind of encoding you are using in that file. (FYI: when I mistyped an encoding I had to ctrl-Z because my file became unreadable.) This is easily accomplished by putting a line like so at the top of your source code file: # -*- coding: utf_8 -*- If you don't have this information, python attempts to parse your code as ascii by default, and so: SyntaxError: Non-ASCII character '\xf3' in file _redacted_ on line 81, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details Once your program is working correctly, or, if you aren't using python's console or any other console to look at output, then you will probably really only care about #1 on the list. System default and console encoding are not that important unless you need to look at output and/or you are using the builtin unicode() function (without any encoding parameters) instead of the string.decode() function. I wrote a demo function I will paste into the bottom of this gigantic mess that I hope correctly demonstrates the items in my list. Here is some of the output when I run the character 'ó' through the demo function, showing how various methods react to the character as input. My system encoding and console output are both set to utf_8 for this run: '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Now I will change the system and console encoding to latin_1, and I get this output for the same input: 'ó' = original char <type 'str'> repr(char)='\xf3' 'ó' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Notice that the 'original' character displays correctly and the builtin unicode() function works now. Now I change my console output back to utf_8. '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' '?' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Here everything still works the same as last time but the console can't display the output correctly. Etc. The function below also displays more information that this and hopefully would help someone figure out where the gap in their understanding is. I know all this information is in other places and more thoroughly dealt with there, but I hope that this would be a good kickoff point for someone trying to get coding with python and/or sqlite. Ideas are great but sometimes source code can save you a day or two of trying to figure out what functions do what. Disclaimers: I'm no encoding expert, I put this together to help my own understanding. I kept building on it when I should have probably started passing functions as arguments to avoid so much redundant code, so if I can I'll make it more concise. Also, utf_8 and latin_1 are by no means the only encoding schemes, they are just the two I was playing around with because I think they handle everything I need. Add your own encoding schemes to the demo function and test your own input. One more thing: there are apparently crazy application developers making life difficult in Windows. #!/usr/bin/env python # -*- coding: utf_8 -*- import os import sys def encodingDemo(str): validStrings = () try: print "str =",str,"{0} repr(str) = {1}".format(type(str), repr(str)) validStrings += ((str,""),) except UnicodeEncodeError as ude: print "Couldn't print the str itself because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print ude try: x = unicode(str) print "unicode(str) = ",x validStrings+= ((x, " decoded into unicode by the default system encoding"),) except UnicodeDecodeError as ude: print "ERROR. unicode(str) couldn't decode the string because the system encoding is set to an encoding that doesn't understand some character in the string." print "\tThe system encoding is set to {0}. See error:\n\t".format(sys.getdefaultencoding()), print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the unicode(str) because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('latin_1') print "str.decode('latin_1') =",x validStrings+= ((x, " decoded with latin_1 into unicode"),) try: print "str.decode('latin_1').encode('utf_8') =",str.decode('latin_1').encode('utf_8') validStrings+= ((x, " decoded with latin_1 into unicode and encoded into utf_8"),) except UnicodeDecodeError as ude: print "The string was decoded into unicode using the latin_1 encoding, but couldn't be encoded into utf_8. See error:\n\t", print ude except UnicodeDecodeError as ude: print "Something didn't work, probably because the string wasn't latin_1 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('latin_1') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('utf_8') print "str.decode('utf_8') =",x validStrings+= ((x, " decoded with utf_8 into unicode"),) try: print "str.decode('utf_8').encode('latin_1') =",str.decode('utf_8').encode('latin_1') except UnicodeDecodeError as ude: print "str.decode('utf_8').encode('latin_1') didn't work. The string was decoded into unicode using the utf_8 encoding, but couldn't be encoded into latin_1. See error:\n\t", validStrings+= ((x, " decoded with utf_8 into unicode and encoded into latin_1"),) print ude except UnicodeDecodeError as ude: print "str.decode('utf_8') didn't work, probably because the string wasn't utf_8 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('utf_8') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t",uee print print "Printing information about each character in the original string." for char in str: try: print "\t'" + char + "' = original char {0} repr(char)={1}".format(type(char), repr(char)) except UnicodeDecodeError as ude: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), ude) except UnicodeEncodeError as uee: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), uee) print uee try: x = unicode(char) print "\t'" + x + "' = unicode(char) {1} repr(unicode(char))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = unicode(char) ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = unicode(char) {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('latin_1') print "\t'" + x + "' = char.decode('latin_1') {1} repr(char.decode('latin_1'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('latin_1') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('latin_1') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('utf_8') print "\t'" + x + "' = char.decode('utf_8') {1} repr(char.decode('utf_8'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('utf_8') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('utf_8') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) print x = 'ó' encodingDemo(x) Much thanks for the answers below and especially to @John Machin for answering so thoroughly.

Read the article
Threading across multiple files

- by Zach M.

My program is reading in files and using thread to compute the highest prime number, when I put a print statement into the getNum() function my numbers are printing out. However, it seems to just lag no matter how many threads I input. Each file has 1 million integers in it. Does anyone see something apparently wrong with my code? Basically the code is giving each thread 1000 integers to check before assigning a new thread. I am still a C noobie and am just learning the ropes of threading. My code is a mess right now because I have been switching things around constantly. #include <stdio.h> #include <stdlib.h> #include <time.h> #include <string.h> #include <pthread.h> #include <math.h> #include <semaphore.h> //Global variable declaration char *file1 = "primes1.txt"; char *file2 = "primes2.txt"; char *file3 = "primes3.txt"; char *file4 = "primes4.txt"; char *file5 = "primes5.txt"; char *file6 = "primes6.txt"; char *file7 = "primes7.txt"; char *file8 = "primes8.txt"; char *file9 = "primes9.txt"; char *file10 = "primes10.txt"; char **fn; //file name variable int numberOfThreads; int *highestPrime = NULL; int fileArrayNum = 0; int loop = 0; int currentFile = 0; sem_t semAccess; sem_t semAssign; int prime(int n)//check for prime number, return 1 for prime 0 for nonprime { int i; for(i = 2; i <= sqrt(n); i++) if(n % i == 0) return(0); return(1); } int getNum(FILE* file) { int number; char* tempS = malloc(20 *sizeof(char)); fgets(tempS, 20, file); tempS[strlen(tempS)-1] = '\0'; number = atoi(tempS); free(tempS);//free memory for later call return(number); } void* findPrimality(void *threadnum) //main thread function to find primes { int tNum = (int)threadnum; int checkNum; char *inUseFile = NULL; int x=1; FILE* file; while(currentFile < 10){ if(inUseFile == NULL){//inUseFIle being used to check if a file is still being read sem_wait(&semAccess);//critical section inUseFile = fn[currentFile]; sem_post(&semAssign); file = fopen(inUseFile, "r"); while(!feof(file)){ if(x % 1000 == 0 && tNum !=1){ //go for 1000 integers and then wait sem_wait(&semAssign); } checkNum = getNum(file); /* * * * * I think the issue is here * * * */ if(checkNum > highestPrime[tNum]){ if(prime(checkNum)){ highestPrime[tNum] = checkNum; } } x++; } fclose(file); inUseFile = NULL; } currentFile++; } } int main(int argc, char* argv[]) { if(argc != 2){ //checks for number of arguements being passed printf("To many ARGS\n"); return(-1); } else{//Sets thread cound to user input checking for correct number of threads numberOfThreads = atoi(argv[1]); if(numberOfThreads < 1 || numberOfThreads > 10){ printf("To many threads entered\n"); return(-1); } time_t preTime, postTime; //creating time variables int i; fn = malloc(10 * sizeof(char*)); //create file array and initialize fn[0] = file1; fn[1] = file2; fn[2] = file3; fn[3] = file4; fn[4] = file5; fn[5] = file6; fn[6] = file7; fn[7] = file8; fn[8] = file9; fn[9] = file10; sem_init(&semAccess, 0, 1); //initialize semaphores sem_init(&semAssign, 0, numberOfThreads); highestPrime = malloc(numberOfThreads * sizeof(int)); //create an array to store each threads highest number for(loop = 0; loop < numberOfThreads; loop++){//set initial values to 0 highestPrime[loop] = 0; } pthread_t calculationThread[numberOfThreads]; //thread to do the work preTime = time(NULL); //start the clock for(i = 0; i < numberOfThreads; i++){ pthread_create(&calculationThread[i], NULL, findPrimality, (void *)i); } for(i = 0; i < numberOfThreads; i++){ pthread_join(calculationThread[i], NULL); } for(i = 0; i < numberOfThreads; i++){ printf("this is a prime number: %d \n", highestPrime[i]); } postTime= time(NULL); printf("Wall time: %ld seconds\n", (long)(postTime - preTime)); } } Yes I am trying to find the highest number over all. So I have made some head way the last few hours, rescucturing the program as spudd said, currently I am getting a segmentation fault due to my use of structures, I am trying to save the largest individual primes in the struct while giving them the right indices. This is the revised code. So in short what the first thread is doing is creating all the threads and giving them access points to a very large integer array which they will go through and find prime numbers, I want to implement semaphores around the while loop so that while they are executing every 2000 lines or the end they update a global prime number. #include <stdio.h> #include <stdlib.h> #include <time.h> #include <string.h> #include <pthread.h> #include <math.h> #include <semaphore.h> //Global variable declaration char *file1 = "primes1.txt"; char *file2 = "primes2.txt"; char *file3 = "primes3.txt"; char *file4 = "primes4.txt"; char *file5 = "primes5.txt"; char *file6 = "primes6.txt"; char *file7 = "primes7.txt"; char *file8 = "primes8.txt"; char *file9 = "primes9.txt"; char *file10 = "primes10.txt"; int numberOfThreads; int entries[10000000]; int entryIndex = 0; int fileCount = 0; char** fileName; int largestPrimeNumber = 0; //Register functions int prime(int n); int getNum(FILE* file); void* findPrimality(void *threadNum); void* assign(void *num); typedef struct package{ int largestPrime; int startingIndex; int numberCount; }pack; //Beging main code block int main(int argc, char* argv[]) { if(argc != 2){ //checks for number of arguements being passed printf("To many threads!!\n"); return(-1); } else{ //Sets thread cound to user input checking for correct number of threads numberOfThreads = atoi(argv[1]); if(numberOfThreads < 1 || numberOfThreads > 10){ printf("To many threads entered\n"); return(-1); } int threadPointer[numberOfThreads]; //Pointer array to point to entries time_t preTime, postTime; //creating time variables int i; fileName = malloc(10 * sizeof(char*)); //create file array and initialize fileName[0] = file1; fileName[1] = file2; fileName[2] = file3; fileName[3] = file4; fileName[4] = file5; fileName[5] = file6; fileName[6] = file7; fileName[7] = file8; fileName[8] = file9; fileName[9] = file10; FILE* filereader; int currentNum; for(i = 0; i < 10; i++){ filereader = fopen(fileName[i], "r"); while(!feof(filereader)){ char* tempString = malloc(20 *sizeof(char)); fgets(tempString, 20, filereader); tempString[strlen(tempString)-1] = '\0'; entries[entryIndex] = atoi(tempString); entryIndex++; free(tempString); } } //sem_init(&semAccess, 0, 1); //initialize semaphores //sem_init(&semAssign, 0, numberOfThreads); time_t tPre, tPost; pthread_t coordinate; tPre = time(NULL); pthread_create(&coordinate, NULL, assign, (void**)numberOfThreads); pthread_join(coordinate, NULL); tPost = time(NULL); } } void* findPrime(void* pack_array) { pack* currentPack= pack_array; int lp = currentPack->largestPrime; int si = currentPack->startingIndex; int nc = currentPack->numberCount; int i; int j = 0; for(i = si; i < nc; i++){ while(j < 2000 || i == (nc-1)){ if(prime(entries[i])){ if(entries[i] > lp) lp = entries[i]; } j++; } } return (void*)currentPack; } void* assign(void* num) { int y = (int)num; int i; int count = 10000000/y; int finalCount = count + (10000000%y); int sIndex = 0; pack pack_array[(int)num]; pthread_t workers[numberOfThreads]; //thread to do the workers for(i = 0; i < y; i++){ if(i == (y-1)){ pack_array[i].largestPrime = 0; pack_array[i].startingIndex = sIndex; pack_array[i].numberCount = finalCount; } pack_array[i].largestPrime = 0; pack_array[i].startingIndex = sIndex; pack_array[i].numberCount = count; pthread_create(&workers[i], NULL, findPrime, (void *)&pack_array[i]); sIndex += count; } for(i = 0; i< y; i++) pthread_join(workers[i], NULL); } //Functions int prime(int n)//check for prime number, return 1 for prime 0 for nonprime { int i; for(i = 2; i <= sqrt(n); i++) if(n % i == 0) return(0); return(1); }

Read the article

Search Results

Search found 3920 results on 157 pages for 'dave mess'.

Page 157/157 | < Previous Page | 153 154 155 156 157

- by arungupta

- by Rick Strahl

- by Paul White

- by Ben Rees

- by Dave_Stott

- by Jean

- by Paul Knopf

- by Davey

- by Richard Knop

- by Michael Williamson

- by Ricardo Ferreira

- by Alan Smith

- by BarsMonster

- by Ralf Westphal

- by BuckWoody

- by Katie

- by Kai

- by Nathan Spears

- by Nathan Spears

- by Zach M.

< Previous Page | 153 154 155 156 157