Search Results

Search found 28590 results on 1144 pages for 'best'.

Page 956/1144 | < Previous Page | 952 953 954 955 956 957 958 959 960 961 962 963 | Next Page >

Can you see something wrong in my working .htaccess?

- by AlexV

OK, after many search, trial and errors I've managed to create an .htaccess that do what I wanted (see explanations and questions after the code block): <IfModule mod_rewrite.c> RewriteEngine On #1 If the requested file is not url-mapper.php (to avoid .htaccess loop) RewriteCond %{REQUEST_FILENAME} (?<!url-mapper\.php)$ #2 If the requested URI does not end with an extension OR if the URI ends with .php* RewriteCond %{REQUEST_URI} !\.(.*) [OR] RewriteCond %{REQUEST_URI} \.php.*$ [NC] #3 If the requested URI is not in an excluded location RewriteCond %{REQUEST_URI} !^/seo-urls\/(excluded1|excluded2)(/.*)?$ #Then serve the URI via the mapper RewriteRule .* /seo-urls/url-mapper.php?uri=%{REQUEST_URI} [L,QSA] </IfModule> This is what the .htaccess should do: #1 is checking that the file requested is not url-mapper.php (to avoid infinite redirect loops). This file will always be at the root of the domain. #2 the .htaccess must only catch URLs that don't end with an extension (www.foo.com -- catch | www.foo.com/catch-me -- catch | www.foo.com/dont-catch.me -- don't catch) and URLs ending with .php* files (.php, .php4, .php5, .php123...). #3 some directories (and childs) can be excluded from the .htaccess (in this case /seo-urls/excluded1 and /seo-urls/excluded2). Finally the .htaccess feed the mapper with an hidden GET parameter named uri containing the requested uri. Even if I tested and everything works, I want to know if what I do is correct (and if it's the "best" way to do it). I've learned a lot with this "project" but I still consider myself a beginner at .htaccess and regular expressions so I want to triple check it there before putting it in production...

Read the article
How are large companies handling the storing and cataloging of software installation disks?

- by CT

I just started working in the IT department of a small-medium sized construction company with about 200 users. One of my responsibilities is to setup and configure all new machines that come in. I would like suggestions on how to best manage the installation disks and licenses of the software that comes with them. Plus any additional licensed software such as Autocad, Photoshop, etc as well as peripheral driver disks such as printers and scanners. Right now every machine is associated and labeled with an asset id. All asset ids are kept in a spreadsheet with applicable serial numbers, current user, warranty info, and software licenses. The physical disks are then kept within a folder in a cabinet. Each folder is marked with the asset id number as well as the current user. My problems with this is that the system was not maintained very completely before I came to the company. There are plenty of software folders with no asset ids labeled on them. Plenty of missing software folders (most likely are a lot of the unlabeled folders). Folders with names but not asset ids. Machines get switched to different users without the folders and spreadsheet being updated. I am not saying this method would necessarily be bad if it was better implemented and managed, but if I am going to have to take a lot of time to fix this system currently in place. I thought I would ask the community first on how others manage this process in case there are easier, more efficient ways of doing so. Thank you.

Read the article
Hosting options for data-enabled web application

- by Hertfordian

I am independently developing an asp.net business application with a MySQL database. I currently have a Windows web hosting account which includes MySQL and MS SQL as installed supported options. I am not yet finally committed to using MySQL and I want to keep my options open to evaluate MS SQL and possibly other options such as PostGreSQL later when more of the business logic is in place - my data access layer will handle the database connectivity. The web hosting setup I have now is fine for development purposes, but if in future I want to use, say, PostGreSQL Server, and a level of usage of, say, 10,000 hits per day concentrated in business hours, I'm assuming I'll need a dedicated server. But in that case, should I just install PostGreSQL on the dedicated server, or is best practice to have a separate database server - perhaps locked down so that it can only be accessed through the web server? And supposing it was only 2000 hits a day - how would that change things? I'd appreciate it if anyone could point me in the direction of a useful guide to these sorts of issues. Naturally if I start paying for separate servers, I would like to know exactly why I'm doing it and what the performance issues and thresholds are.

Read the article
Very high Magento/Apache memory usage even without visitors (are we fooled by our hosting company?)

- by MrDobalina

I am no server guy and we have issues with our speed so I come here asking for advise. We have a VPS with 2 cores and 2gb of RAM at a Magento specialized hosting company. Over the course of the last weeks our site speed has gotten worse, even though our store is new, has less than 1000 SKUs and not even 100 visitos a day. At magespeedtest.com we only get 1.87 trans/sec @ 2.11 secs each with a mere 5 concurrent users. Our magento log files are clean, we have no huge database tables or anything like that. When we take a look at our server real time stats, we see that the memory usage jumped up from about 34% to 71% and now 82% in just a few days in idle, with no visitors on the site. Our hosting company said that we do not need to worry about that as it`s maybe related to mysql which creates buffers (which are maybe not even actually being used) and what is important is CPU and swap - stats are ok here. They also said that the low benchmark scores are caused by bad extensions or template modifications on our side. We are not sure if we can trust that statement as we only have 4 plugins installed (all from aheadworks and amasty which are known to be one of the best magento extension developers). Our template modifications are purely html and css, no modifications to the php code. Our pagespeed is ranked with 93/100 in firebug and Magento is properly configured, so the problem really just gets obvious when there are a handful of users on the site at the same time. Can anyone confirm our hosting`s statement about memory usage and where can I start looking for a solution?

Read the article
Downloading Emails locally with Thunderbird

- by r_honey

I am using Gmail (web interface) for sometime now, and have well over 20 labels and some thousand mails there archived to different labels in Gmail. Now I want to have a local copy of all my mails and following points are important in the context: The Primary mail access mechanism would continue to be Gmail web for me. I just want a backup of my mail account locally. Ideally the mails should download locally in folders named after Gmail labels (I know this is possible via IMAP but probably not by POP) After all my mails are available locally, I will delete most of them in Gmail to free up space and because I want to archive them. The mails should continue to exist locally and should not be deleted when I delete when from Gmail web interface. I would be syncing my gmail account locally let's say every month. So, the new mails that I have sent/received during this period should come over to my local mailbox in the folders named after Gmail labels. I do understand that Gmail maintains a single copy of email having 2 different labels and such email would get duplicated locally in the 2 folders and I am okay with that. Essentially you can see I just want to archive my mails from the Gmail server to a local backup and then sync (one way from Gmail to locally) new mails at regular intervals. For some points above, POP seems to be the option while IMAP seems for the others. I am really confused and need help in deciding which of POP or IMAP would suit me best. I have currently chosen Thunderbird to be my local email client but would not have a problem switching to Outlook or anything else as long as I get my desired archiving functionality.

Read the article
Old network login passed to IIS

- by 300 baud

Let me start by saying that I am not a server guy - I am a developer. But I develop and manage an ASP.NET application that uses Windows authentication. I've run into the problem I am about to describe before, and I would just like to understand how to remedy it since I am the one who always gets the original support request. A user, let's call her JaneDoe, has just gotten married and her login has been changed to JaneJones. We have an application that uses Windows authentication to store the user's login name to a table and then redirects the user to another non-Windows authenticated site with a GUID which points to the table entry we just made. When the user reaches the second site, we read in the login name from the database using the GUID that was passed. Then, we look up the login name in another database where we track application permissions. The problem is that the user is logging into her workstation as JaneJones, but the Windows authenticated site is still receiving a login name of JaneDoe. Is this a domain controller issue? Is it a workstation issue? What's the best way to resolve this?

Read the article
What could cause random files being uploaded without permission?

- by Dustin

I have been having issues lately with a certain directory. It seems someone is placing files into it, or something of that sort, and any attempt to delete them is successful, HOWEVER they reappear over time (maybe not the exact same ones, but random files). I will provide you the information I can and several pictures of my problem: sandbox.mys4l.com/visual/files/b1.jpg Files like this have been appearing in my /visual/ folder, and I have no clue where they are coming from. sandbox.mys4l.com/visual/files/b2.jpg This is what is inside on of those weird files, it appears to be nothing problematic. sandbox.mys4l.com/visual/files/b4.jpg As you can see, in the time it took me to take the first picture, more odd files showed up. These log files are also being uploaded to this directory, and I know I didn't put them there. sandbox.mys4l.com/visual/files/b7.jpg This inside one of these mysterious .log files, I'm not sure what it's all about. These files only appear to be going into this specific area, and I'm not sure of their origin, only that they will not go away. I have done a full system scan at least twice with an up-to-date virus scan, and have looked for an unknown script which may be writing them there. Nothing has come up, so I come to you guys as I hear this is the best place to find answers. Hope this problem has a solution!

Read the article
Linux servers in a (primarily) Windows (AD) environment

- by HannesFostie

When I arrived at my current position, our environment existed almost exclusively of Windows servers. However, I am a big fan of using Linux for certain applications, like the webgallery I was asked to set up, a simple SFTP server, Nagios for monitoring etc. I do fine setting these up, but not being the Linux expert, I am not sure how to properly join these servers to the domain and was therefor wondering what procedures or guidelines other people follow. We often use ping -a to quickly figure out the hostname of a certain server, but this does not seem to work for the linux machines, most likely because of the whole WINS/NetBios thing I assume. I just joined one server to the domain, but probably missed something cause it's not working even after a dnsflush. Next to that, the couple procedures I've found so far are pretty extensive and most of the time don't seem worth the time. Best case scenario, I download some kind of client (smbclient?), enter the domain name and maybe the server to use, supply an administrator password and that's it. Is that possible at all? Thanks

Read the article
Centos 5.5 install PearDB

- by John Gardeniers

Disclaimer: I use Linux for some jobs but I am not a Linux admin. I have a Centos 5.4 machine which performs some server duties and doubles as a web site development machine. PHP 5.3.3 was installed from RPM with the --without-pear option. I now wish to use PearDB but can't figure out how to install it. If I run yum install php-pear-db, it comes back with Error: Missing Dependency: php = 5.1.6-27.el5_5.3 is needed by package php-devel-5.1.6-27.el5_5.3.i386 (updates). The only RPM I've found that looks like it might be close currently has a dead link, so I can't even try that. What would be the best way to go about this? Is there a way to reinstall from the RPM and include pear? Can I install the dependency without breaking the current installation? Should I try to uninstall the original PHP and reinstall it from source, complete with pear? I thought this might have been an SU question but the FAQ over there suggests otherwise.

Read the article
Sending 10,000 emails?

- by Philipp Lenssen

(I don't know whether to submit this to SuperUser or StackOverflow.) My friend and I have a subscriber list -- strictly opt-in, for people interested when we release new projects -- and we may want to send out a first email this year. How would one best send an email to over 10,000 people? Perhaps there is a good, I suppose paid service, which would do this for one? The key being that the emails ought arrive, with good chances of not being immediately labeled as spam due to the sender or something (as it's definitely not spam). When I simply send emails from my 1&1 server, they often immediately make it into the spam folder, even though when sending the same email through say Google Appspot, it arrives fine. FWIW, the email addresses are not yet verified (people simply signed up for the list by providing an email, and optionally a name). Analyzing what bounced may be a pro. BTW, we will provide an opt-out link with every mail. Thanks for any pointers, and sorry if this doesn't fit to ServerFault 100%!

Read the article
Computer is dying--what should I be looking for?

- by Will

Okay, I'm a bit knowledgeable with pooters and such, but i'm confused. My computer is dying slowly, and I'm not sure what part is causing this. Computer details: Vista, dell machine, intel Q6600, 2.4 Core Duo (quad core), standard memory and drive (unknown manufacturer). Symptoms: I would best describe the symptoms as memory corruption. After a couple days on, I start getting applications crashing or failing to open for a lack of "resources". Sounds are corrupted. Onscreen text gets corrupted; the characters of text are garbled, not the pixels on the screen. Video memory seems untouched as I haven't seen any misplaced pixels. Recently I've lost files on disk. I've also experienced errors reporting a supposed lack of disk space, even though I have fifty gigs free. There was one point where I couldn't get to the POST when booting up. After I cleaned everything (see next) this hasn't happened. Diagnostic steps: First thing I did was clean the case. There was a lot of dust buildup on heatsinks, so I cleaned all that up. No help. Next, I disconnected and reconnected everything, from power cables to memory (did not reseat cpu). No change. Last, I ran the standard vista memory diagnostics and ran checkdisk. Both reported no errors found. I have not run any POST tests, now that I think about it. I'm at a loss at this point. Disk appears fine, memory too. I'd expect motherboard issues to result in the thing not booting up, yet it does every time. What should I be looking at? What more can I do?

Read the article
NOTEPAD++ Need macro or typeitin for automation of large lists

- by user2526699

I'm sure there is a way to do this but I can not seem to figure it out. I will try my best to explain this. I have a list with 20,000 lines in notepad++. I have two tabs open in notepad++. The right side tab is the main list. The left side tab is what needs to be added to the beginning of each line in the right tab. Here is an image of my notepad++ to give you a better understanding. I need to be able to do the following in an automated way as I have over 20,000 lines to do this way. copy line 1 of tab 'new 7' switch to tab 'new 6' paste clipboard(line 1 of tab 'new 7') at beginning of line 1 tab 'new 6' switch back to tab 'new 7' copy line 2 of tab 'new 7' switch to tab 'new 6' paste clipboard(line 2 of tab 'new 7') at beginning of line 2 tab 'new 6' I have both pasteitin and typeitin download but if i need some other program/app or if it's built in to notepad++ that would be great. I need to do this by the program itself or for me to only have to press a button to do each of these.

Read the article
Monitoring Between EC2 Regions

- by ABrown

I'm working on a small EC2 project that involves a handful of servers in two different regions (US East and EU West). My first task is to implement a Nagios monitoring solution. Monitoring within a region is simple - I just use the private domain names/IPs, but I'm a little unsure of the best way to handle monitoring the second region without setting up a second Nagios install. The environment is fairly static, so I'm not going to be scripting the configuration with the EC2 tools just yet. As I see it, I have two options. Two Nagios installations (which is over-kill for the small number of servers I'm dealing with). Pros: I don't have to alter the group permissions nor do I have to pay for the traffic, redundancy in the monitoring solution - I could monitor the Nagios servers. Cons: two installations to deal with and I'd need to run another server instance. Have the single installation monitor both regions. Pros: one installation to deal with. Cons: slightly reduced security - security group will have to have NRPE (5666) opened for one source IP and also paying for a small amount of bandwidth at the Internet rate for data transfer between the regions. I guess my question is - how have others handled this problem and what are your recommendations? Thanks!

Read the article
Printer irregularly producing garbage output

- by John Gardeniers

Every now and then instead of getting the proper output we get numerous pages, mostly with just a single line, of output which appears to be the raw PCL. My theory is that this happens when the first byte or two of the document is somehow not received by the printer, which then doesn't know how to interpret the rest and does it's best by spitting it out as text. This is a problem I've seen many times over the years but has been popping up more often since we upgraded to Win 7 64 bit, which introduced a number of headaches because of the HP lack of real support for 64 bits. It also appears to happen most often when printing PDF files. We have tried several different PDF readers in addition to Adobe's own but that hasn't helped. While we mainly use HP printers, and the problem is not limited to any particular model, I've also seen it happen on other brands, albeit to a lesser extent. I've also been unable to discern a difference between printers used via a print server or those connected directly by IP address. It also happens to USB attached printers. Because of the erratic nature of this problem there is precious little I can think of to try and debug it, so I'm after any ideas that might help to eliminate it.

Read the article
Linux RHEL : Making disk image efficiently

- by TheProfoundGeek

I have a linux box having RHEL. Its disk (hda1) is having free space of about 25GB. I have an another disk (hda2) which is of 250GB having another RHEL instance, it's partitioned for 200GB. Data on the disk occupies about 21GB of data. The image of hda2 needs to be taken and restored on other disk of same specs. What is the best way to make image file of the hda2? Ideally the images size should be around 25GBs as the actual data on the disk is just 21GB. I am aware about the following two methods. Method 1 : Raw Image dd if=/dev/hda2 of=/path/to/image dd if=/path/to/image of=/dev/hda3 Question 1 : Will the above method make a gigantic image of 250GBs? Is it efficient? Method 2 : Compressed Image. dd if=/dev/hda2 | gzip > /path/to/image.gz gzip -dc /path/to/image.gz | dd of=/dev/hda2 Question 2 : I tried the method 2, its taking too long. What are the pit falls of this methods? Which of the above method id efficient and why? Is there any other Linux utility which can do the job? Third party tools are no no.

Read the article
Deploying and publishing my first asp.net mvc 3 web application

- by john G

I want to deploy and publish my first asp.net mvc 3 web application at my client side (the client is a small office with 2-4 employees that need to access the application) , currently i finished developing my web application using the free Microsoft Visual Web Developer 2010 express and the free SQL Server 2008 R2 Express database . So my concerns are :- 1. The free sql express database that i am currently using have a limitation of 10 GB i think So i want to buy SQL server for small business to remove the database limitations. So my questions that i need help with them are:- **1. If i use the sql server for small business then will my web application have other limitations on production i am unaware of ? 2. Will be using SQL server for small business my right choice? baring in my that the system will be used by 2-4 clients only? 3. How much does “approximate ” the sql server database will cost in US dollars? 4. Are there any other software that i need to buy to be able to deploy and publish the application on intranet and the internet ?** Appreciate any help Best Regards

Read the article
Iptables state tracking

- by complexgeek

Hi there. I've just taken over administration of a fairly complex firewall ruleset for a firewall box running Fedora Core 12, and there's one thing about it that is puzzling me. When I run nmap on the gateway from outside the network, I see all the expected services, but also sunrpc on port 111. The INPUT chain has DEFAULT DROP set, and there is no rule allowing port 111. As best I can tell (watching the packet counters before/during/after the scan) it's being allowed by the rule: "-m state --state RELATED,ESTABLISHED -j ACCEPT" but I don't understand why a brand new TCP connection would be considered RELATED or ESTABLISHED. Any suggestions would be greatly appreciated. EDIT: Conntrack modules: nf_conntrack_netlink 14925 0 nfnetlink 3479 1 nf_conntrack_netlink nf_conntrack_irc 5206 1 nf_nat_irc nf_conntrack_proto_udplite 3138 0 nf_conntrack_h323 62110 1 nf_nat_h323 nf_conntrack_proto_dccp 6878 0 nf_conntrack_sip 16921 1 nf_nat_sip nf_conntrack_proto_sctp 11131 0 nf_conntrack_pptp 10673 1 nf_nat_pptp nf_conntrack_sane 5458 0 nf_conntrack_proto_gre 6574 1 nf_conntrack_pptp nf_conntrack_amanda 2796 1 nf_nat_amanda nf_conntrack_ftp 11741 1 nf_nat_ftp nf_conntrack_tftp 4665 1 nf_nat_tftp nf_conntrack_netbios_ns 1534 0 nf_conntrack_ipv6 18504 2 ipv6 279399 40 ip6t_REJECT,nf_conntrack_ipv6 INPUT chain on the filter table: -A INPUT -s 192.168.200.10/32 -p tcp -m tcp --dport 22 -m state --state NEW,ESTABLISHED -j ACCEPT -A INPUT -s 127.0.0.0/8 -i lo -j ACCEPT -A INPUT -p udp -m udp --sport 67:68 --dport 67:68 -j ACCEPT -A INPUT -d 192.168.200.5/32 -i eth0 -j ACCEPT -A INPUT -d 192.168.1.2/32 -i eth0 -j ACCEPT -A INPUT -d {public_ip}/32 -i ppp0 -p tcp -m multiport --dports 22,80,443 -j ACCEPT -A INPUT -d {public_ip}/32 -i ppp0 -p tcp -m multiport --sports 22,25,80,443 -j ACCEPT -A INPUT -d {public_ip}/32 -i ppp0 -p udp -m udp --dport 1194 -j ACCEPT -A INPUT -d {public_ip}/32 -i ppp0 -p udp -m udp --sport 1194 -j ACCEPT -A INPUT -d {public_ip}/32 -i ppp0 -p udp -m multiport --sports 53,123 -j ACCEPT -A INPUT -d {public_ip}/32 -i ppp0 -p icmp -m icmp --icmp-type 8 -j ACCEPT -A INPUT -i eth0 -m state --state NEW -j ACCEPT -A INPUT -d {public_ip}/32 -m state --state NEW -j ACCEPT -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT eth0 is connected to the internal network, eth3 is connected to an ADSL modem in bridge mode, ppp0 is the WAN connection tunneled over eth3.

Read the article
Network using only switches

- by mschultz

So I'm not a network guy - but here's what I want to do - I have an existing network using wifi, which I like, and which is used to connect several computers to the internet. It is headed up by a router, which is in another part of the building. Three of those computers are in my office. All three have gigabit wired ethernet. I have a gigabit switch. Here's what I want to do: Build a 2nd network, out of just that switch, which allows all 3 computers to connect to each other (just to each other is fine, for this purpose, they need no internet). I have a distributed computing task (rendering high-quality fractal artwork, as it were), that requires the best connection speed to all 3 computers. I want them to be able to "talk to each other" as quickly as possible, with the fewest dropped packets (the dataflow over this network will be quite high). So how do I do this. I'm not a networking guy at all - I tried connecting them all, and nobody got an IP address (which I assume is because nobody is running a DNS server?). What all do I need to do to make this work? PS - two are running windows, one is running ubuntu.

Read the article
Windows Server 2008 (x64): Wont boot past bios page

- by WebSolProv

Happy New Year, Since a month or so ago I inherited responsibility for small network administration for my sins. The domain controller (yes there is only one, and yes I know it is best practice to have two even in a small domain setup) went down overnight and I have been trying all day to get it back up and running. Unfortunately this machine also administers our entire ActiveDirectory setup: 1) It goes thru the BIOS without any errors, nothing whatsoever 2) It gets into the “select safe mode, safe mode with networking, normal” etc and if you select either of the safe mode options it loads a few files then reboots. If you select normal it just runs for a bit (doesn’t get to the windows splash screen) and then reboots again. 3) If you select windows repair, it asks for an image to repair too: however it would appear that none was taken that can be used (!!) or one is not being shown. 4) I have tried repairing the boot sector and the boot configuration using Bootrec.exe, both which it says were completed successfully but still it doesn’t work. 5) I have tried swapping the drives into another server to rule out hardware and that didn’t work either so clearly it’s the OS. 6) I have tried running chkdsk which ran fine, and also memory check which was also fine. We do have another machine on the network that was installed as a DC so when we decommission the current infrastructure but when I try and "promote" this to the lead DC then I get “you cannot modify domain or trust information because a PDC emulator cannot be contacted" so I am unable to replicare the ActiveDirectory details. If anyone can think of any direction I should follow it would be greatly appreciated, Thanks, Alex

Read the article
3 Server, is this a cluster scenario?

- by HornedBeast

Hello, At the moment I have one Ubuntu server, 9.10, running with a simple Samba share, a mail server, DNS server and DHCP server. Mostly its just there for file sharing and email server. I also have 2 other servers that are exactly the same hardware and spec as the first, which have an rsync set up to retrieve the shared folders and backs them up. However, if the first server goes down, all of our shares disappear along with our mail and the system must be rebuilt. Also I tend to find if people are downloading a large amount from the file server, no-one can access there emails - especially in the morning when everyone is signing in at once. Would it be more beneficial for me to have all 3 servers, all running the same services, doing the same thing with some sort of cluster with load balancing? In short, how can I get the best out of my 3 hardware servers? I'm not really sure where to begin looking, or how to go about such a setup where 3 servers are all identical, but perhaps one acts as the main load balancer?? If someone can point me in the right direction, or if this simply sounds like one of those Enterprise Cloud's that is now a default setup in Ubuntu Server 9.10+, then I'll go down that route. Cheers in advance. Andy

Read the article
Do email providers have to tell me which (inter)national agencies/institutes are requesting legal access to my account data?

- by Juve

I know this question is not technical, but i did not find the "stackoverflow for legal issues" and I guess all you super users out there might know the answer. Here is my (potential) problem: I have a free email account at a (inter)national email provider. I used the words "wikileaks" and "twitter" lately in my email. Some over-ambitious national security organization legally requests access to all accounts that behaved similarly. Q1: Can I request the who-, when-, and why-information related to this legal request from my provider? Does he have to tell me which (inter)national organizations (legally) requested my account data? Q2: Does the situation change if I live in Germany (and have a German provider)? I guess here are some German users. And I know that such a legal policy exists for our national credit rating agency. I can request who got access to my data, they have to tell me. Please answer only if you know a good answer, I don't want to start a long discussion on this none-technical question. Best regards, Juve

Read the article
Perfmon % Processor Time vs. task manager's CPU usage

- by nat

I'm new to using Perfmon and performance monitoring in general (so go easy on me please ;) I know that Perfmon doesn't have anything exactly like Task Manager's CPU usage display, but I'm trying to figure out how to monitor user's CPU usage via Perfmon in a similar way, and trying to understand the measurements (or how to convert the numbers to get a similar understanding) For example, if in Task Manager, a particular user is consistently using more than 5% CPU, I would want to contact the user about it. I learn best by example, so here is exactly what I'm trying to do, with a specific example: This is for a 32-bit Dual Quad Core Windows 2003 web server (8 CPUs), there are many web sites on the server, each running within their own application pool/worker process ID. Through other research here I learned of a registry change that I made so that the PID shows up with the w3wp process so I can easily identify the site later by cross-referencing it. I set up a counter with the following settings: Process -> % Processor Time -> all instances Here is an example. Say I'm interested in "black line" user in this graph below, as his process is spiking quite high compared to all the other users: (I wasn't allowed to post the image as I'm a new user on this site.. I've uploaded the image to:) http://i35.tinypic.com/106yn8k.jpg So... using this as an example, I see that they have an AVERAGE % PROCESSOR TIME of 23.264 , and have spiked as high as 103.124 So what exactly does this 23.264 number mean to me? Is it similar to an average of Task Manager's CPU reading for this user? Or, since this server has 8 CPUs, should I divide this number by 8? (23.264/8 = 2.9% AVERAGE CPU LOAD?) Thanks in advance.

Read the article
What are good design practices when working with Entity Framework

- by AD

This will apply mostly for an asp.net application where the data is not accessed via soa. Meaning that you get access to the objects loaded from the framework, not Transfer Objects, although some recommendation still apply. This is a community post, so please add to it as you see fit. Applies to: Entity Framework 1.0 shipped with Visual Studio 2008 sp1. Why pick EF in the first place? Considering it is a young technology with plenty of problems (see below), it may be a hard sell to get on the EF bandwagon for your project. However, it is the technology Microsoft is pushing (at the expense of Linq2Sql, which is a subset of EF). In addition, you may not be satisfied with NHibernate or other solutions out there. Whatever the reasons, there are people out there (including me) working with EF and life is not bad.make you think. EF and inheritance The first big subject is inheritance. EF does support mapping for inherited classes that are persisted in 2 ways: table per class and table the hierarchy. The modeling is easy and there are no programming issues with that part. (The following applies to table per class model as I don't have experience with table per hierarchy, which is, anyway, limited.) The real problem comes when you are trying to run queries that include one or many objects that are part of an inheritance tree: the generated sql is incredibly awful, takes a long time to get parsed by the EF and takes a long time to execute as well. This is a real show stopper. Enough that EF should probably not be used with inheritance or as little as possible. Here is an example of how bad it was. My EF model had ~30 classes, ~10 of which were part of an inheritance tree. On running a query to get one item from the Base class, something as simple as Base.Get(id), the generated SQL was over 50,000 characters. Then when you are trying to return some Associations, it degenerates even more, going as far as throwing SQL exceptions about not being able to query more than 256 tables at once. Ok, this is bad, EF concept is to allow you to create your object structure without (or with as little as possible) consideration on the actual database implementation of your table. It completely fails at this. So, recommendations? Avoid inheritance if you can, the performance will be so much better. Use it sparingly where you have to. In my opinion, this makes EF a glorified sql-generation tool for querying, but there are still advantages to using it. And ways to implement mechanism that are similar to inheritance. Bypassing inheritance with Interfaces First thing to know with trying to get some kind of inheritance going with EF is that you cannot assign a non-EF-modeled class a base class. Don't even try it, it will get overwritten by the modeler. So what to do? You can use interfaces to enforce that classes implement some functionality. For example here is a IEntity interface that allow you to define Associations between EF entities where you don't know at design time what the type of the entity would be. public enum EntityTypes{ Unknown = -1, Dog = 0, Cat } public interface IEntity { int EntityID { get; } string Name { get; } Type EntityType { get; } } public partial class Dog : IEntity { // implement EntityID and Name which could actually be fields // from your EF model Type EntityType{ get{ return EntityTypes.Dog; } } } Using this IEntity, you can then work with undefined associations in other classes // lets take a class that you defined in your model. // that class has a mapping to the columns: PetID, PetType public partial class Person { public IEntity GetPet() { return IEntityController.Get(PetID,PetType); } } which makes use of some extension functions: public class IEntityController { static public IEntity Get(int id, EntityTypes type) { switch (type) { case EntityTypes.Dog: return Dog.Get(id); case EntityTypes.Cat: return Cat.Get(id); default: throw new Exception("Invalid EntityType"); } } } Not as neat as having plain inheritance, particularly considering you have to store the PetType in an extra database field, but considering the performance gains, I would not look back. It also cannot model one-to-many, many-to-many relationship, but with creative uses of 'Union' it could be made to work. Finally, it creates the side effet of loading data in a property/function of the object, which you need to be careful about. Using a clear naming convention like GetXYZ() helps in that regards. Compiled Queries Entity Framework performance is not as good as direct database access with ADO (obviously) or Linq2SQL. There are ways to improve it however, one of which is compiling your queries. The performance of a compiled query is similar to Linq2Sql. What is a compiled query? It is simply a query for which you tell the framework to keep the parsed tree in memory so it doesn't need to be regenerated the next time you run it. So the next run, you will save the time it takes to parse the tree. Do not discount that as it is a very costly operation that gets even worse with more complex queries. There are 2 ways to compile a query: creating an ObjectQuery with EntitySQL and using CompiledQuery.Compile() function. (Note that by using an EntityDataSource in your page, you will in fact be using ObjectQuery with EntitySQL, so that gets compiled and cached). An aside here in case you don't know what EntitySQL is. It is a string-based way of writing queries against the EF. Here is an example: "select value dog from Entities.DogSet as dog where dog.ID = @ID". The syntax is pretty similar to SQL syntax. You can also do pretty complex object manipulation, which is well explained [here][1]. Ok, so here is how to do it using ObjectQuery< string query = "select value dog " + "from Entities.DogSet as dog " + "where dog.ID = @ID"; ObjectQuery<Dog> oQuery = new ObjectQuery<Dog>(query, EntityContext.Instance)); oQuery.Parameters.Add(new ObjectParameter("ID", id)); oQuery.EnablePlanCaching = true; return oQuery.FirstOrDefault(); The first time you run this query, the framework will generate the expression tree and keep it in memory. So the next time it gets executed, you will save on that costly step. In that example EnablePlanCaching = true, which is unnecessary since that is the default option. The other way to compile a query for later use is the CompiledQuery.Compile method. This uses a delegate: static readonly Func<Entities, int, Dog> query_GetDog = CompiledQuery.Compile<Entities, int, Dog>((ctx, id) => ctx.DogSet.FirstOrDefault(it => it.ID == id)); or using linq static readonly Func<Entities, int, Dog> query_GetDog = CompiledQuery.Compile<Entities, int, Dog>((ctx, id) => (from dog in ctx.DogSet where dog.ID == id select dog).FirstOrDefault()); to call the query: query_GetDog.Invoke( YourContext, id ); The advantage of CompiledQuery is that the syntax of your query is checked at compile time, where as EntitySQL is not. However, there are other consideration... Includes Lets say you want to have the data for the dog owner to be returned by the query to avoid making 2 calls to the database. Easy to do, right? EntitySQL string query = "select value dog " + "from Entities.DogSet as dog " + "where dog.ID = @ID"; ObjectQuery<Dog> oQuery = new ObjectQuery<Dog>(query, EntityContext.Instance)).Include("Owner"); oQuery.Parameters.Add(new ObjectParameter("ID", id)); oQuery.EnablePlanCaching = true; return oQuery.FirstOrDefault(); CompiledQuery static readonly Func<Entities, int, Dog> query_GetDog = CompiledQuery.Compile<Entities, int, Dog>((ctx, id) => (from dog in ctx.DogSet.Include("Owner") where dog.ID == id select dog).FirstOrDefault()); Now, what if you want to have the Include parametrized? What I mean is that you want to have a single Get() function that is called from different pages that care about different relationships for the dog. One cares about the Owner, another about his FavoriteFood, another about his FavotireToy and so on. Basicly, you want to tell the query which associations to load. It is easy to do with EntitySQL public Dog Get(int id, string include) { string query = "select value dog " + "from Entities.DogSet as dog " + "where dog.ID = @ID"; ObjectQuery<Dog> oQuery = new ObjectQuery<Dog>(query, EntityContext.Instance)) .IncludeMany(include); oQuery.Parameters.Add(new ObjectParameter("ID", id)); oQuery.EnablePlanCaching = true; return oQuery.FirstOrDefault(); } The include simply uses the passed string. Easy enough. Note that it is possible to improve on the Include(string) function (that accepts only a single path) with an IncludeMany(string) that will let you pass a string of comma-separated associations to load. Look further in the extension section for this function. If we try to do it with CompiledQuery however, we run into numerous problems: The obvious static readonly Func<Entities, int, string, Dog> query_GetDog = CompiledQuery.Compile<Entities, int, string, Dog>((ctx, id, include) => (from dog in ctx.DogSet.Include(include) where dog.ID == id select dog).FirstOrDefault()); will choke when called with: query_GetDog.Invoke( YourContext, id, "Owner,FavoriteFood" ); Because, as mentionned above, Include() only wants to see a single path in the string and here we are giving it 2: "Owner" and "FavoriteFood" (which is not to be confused with "Owner.FavoriteFood"!). Then, let's use IncludeMany(), which is an extension function static readonly Func<Entities, int, string, Dog> query_GetDog = CompiledQuery.Compile<Entities, int, string, Dog>((ctx, id, include) => (from dog in ctx.DogSet.IncludeMany(include) where dog.ID == id select dog).FirstOrDefault()); Wrong again, this time it is because the EF cannot parse IncludeMany because it is not part of the functions that is recognizes: it is an extension. Ok, so you want to pass an arbitrary number of paths to your function and Includes() only takes a single one. What to do? You could decide that you will never ever need more than, say 20 Includes, and pass each separated strings in a struct to CompiledQuery. But now the query looks like this: from dog in ctx.DogSet.Include(include1).Include(include2).Include(include3) .Include(include4).Include(include5).Include(include6) .[...].Include(include19).Include(include20) where dog.ID == id select dog which is awful as well. Ok, then, but wait a minute. Can't we return an ObjectQuery< with CompiledQuery? Then set the includes on that? Well, that what I would have thought so as well: static readonly Func<Entities, int, ObjectQuery<Dog>> query_GetDog = CompiledQuery.Compile<Entities, int, string, ObjectQuery<Dog>>((ctx, id) => (ObjectQuery<Dog>)(from dog in ctx.DogSet where dog.ID == id select dog)); public Dog GetDog( int id, string include ) { ObjectQuery<Dog> oQuery = query_GetDog(id); oQuery = oQuery.IncludeMany(include); return oQuery.FirstOrDefault; } That should have worked, except that when you call IncludeMany (or Include, Where, OrderBy...) you invalidate the cached compiled query because it is an entirely new one now! So, the expression tree needs to be reparsed and you get that performance hit again. So what is the solution? You simply cannot use CompiledQueries with parametrized Includes. Use EntitySQL instead. This doesn't mean that there aren't uses for CompiledQueries. It is great for localized queries that will always be called in the same context. Ideally CompiledQuery should always be used because the syntax is checked at compile time, but due to limitation, that's not possible. An example of use would be: you may want to have a page that queries which two dogs have the same favorite food, which is a bit narrow for a BusinessLayer function, so you put it in your page and know exactly what type of includes are required. Passing more than 3 parameters to a CompiledQuery Func is limited to 5 parameters, of which the last one is the return type and the first one is your Entities object from the model. So that leaves you with 3 parameters. A pitance, but it can be improved on very easily. public struct MyParams { public string param1; public int param2; public DateTime param3; } static readonly Func<Entities, MyParams, IEnumerable<Dog>> query_GetDog = CompiledQuery.Compile<Entities, MyParams, IEnumerable<Dog>>((ctx, myParams) => from dog in ctx.DogSet where dog.Age == myParams.param2 && dog.Name == myParams.param1 and dog.BirthDate > myParams.param3 select dog); public List<Dog> GetSomeDogs( int age, string Name, DateTime birthDate ) { MyParams myParams = new MyParams(); myParams.param1 = name; myParams.param2 = age; myParams.param3 = birthDate; return query_GetDog(YourContext,myParams).ToList(); } Return Types (this does not apply to EntitySQL queries as they aren't compiled at the same time during execution as the CompiledQuery method) Working with Linq, you usually don't force the execution of the query until the very last moment, in case some other functions downstream wants to change the query in some way: static readonly Func<Entities, int, string, IEnumerable<Dog>> query_GetDog = CompiledQuery.Compile<Entities, int, string, IEnumerable<Dog>>((ctx, age, name) => from dog in ctx.DogSet where dog.Age == age && dog.Name == name select dog); public IEnumerable<Dog> GetSomeDogs( int age, string name ) { return query_GetDog(YourContext,age,name); } public void DataBindStuff() { IEnumerable<Dog> dogs = GetSomeDogs(4,"Bud"); // but I want the dogs ordered by BirthDate gridView.DataSource = dogs.OrderBy( it => it.BirthDate ); } What is going to happen here? By still playing with the original ObjectQuery (that is the actual return type of the Linq statement, which implements IEnumerable), it will invalidate the compiled query and be force to re-parse. So, the rule of thumb is to return a List< of objects instead. static readonly Func<Entities, int, string, IEnumerable<Dog>> query_GetDog = CompiledQuery.Compile<Entities, int, string, IEnumerable<Dog>>((ctx, age, name) => from dog in ctx.DogSet where dog.Age == age && dog.Name == name select dog); public List<Dog> GetSomeDogs( int age, string name ) { return query_GetDog(YourContext,age,name).ToList(); //<== change here } public void DataBindStuff() { List<Dog> dogs = GetSomeDogs(4,"Bud"); // but I want the dogs ordered by BirthDate gridView.DataSource = dogs.OrderBy( it => it.BirthDate ); } When you call ToList(), the query gets executed as per the compiled query and then, later, the OrderBy is executed against the objects in memory. It may be a little bit slower, but I'm not even sure. One sure thing is that you have no worries about mis-handling the ObjectQuery and invalidating the compiled query plan. Once again, that is not a blanket statement. ToList() is a defensive programming trick, but if you have a valid reason not to use ToList(), go ahead. There are many cases in which you would want to refine the query before executing it. Performance What is the performance impact of compiling a query? It can actually be fairly large. A rule of thumb is that compiling and caching the query for reuse takes at least double the time of simply executing it without caching. For complex queries (read inherirante), I have seen upwards to 10 seconds. So, the first time a pre-compiled query gets called, you get a performance hit. After that first hit, performance is noticeably better than the same non-pre-compiled query. Practically the same as Linq2Sql When you load a page with pre-compiled queries the first time you will get a hit. It will load in maybe 5-15 seconds (obviously more than one pre-compiled queries will end up being called), while subsequent loads will take less than 300ms. Dramatic difference, and it is up to you to decide if it is ok for your first user to take a hit or you want a script to call your pages to force a compilation of the queries. Can this query be cached? { Dog dog = from dog in YourContext.DogSet where dog.ID == id select dog; } No, ad-hoc Linq queries are not cached and you will incur the cost of generating the tree every single time you call it. Parametrized Queries Most search capabilities involve heavily parametrized queries. There are even libraries available that will let you build a parametrized query out of lamba expressions. The problem is that you cannot use pre-compiled queries with those. One way around that is to map out all the possible criteria in the query and flag which one you want to use: public struct MyParams { public string name; public bool checkName; public int age; public bool checkAge; } static readonly Func<Entities, MyParams, IEnumerable<Dog>> query_GetDog = CompiledQuery.Compile<Entities, MyParams, IEnumerable<Dog>>((ctx, myParams) => from dog in ctx.DogSet where (myParams.checkAge == true && dog.Age == myParams.age) && (myParams.checkName == true && dog.Name == myParams.name ) select dog); protected List<Dog> GetSomeDogs() { MyParams myParams = new MyParams(); myParams.name = "Bud"; myParams.checkName = true; myParams.age = 0; myParams.checkAge = false; return query_GetDog(YourContext,myParams).ToList(); } The advantage here is that you get all the benifits of a pre-compiled quert. The disadvantages are that you most likely will end up with a where clause that is pretty difficult to maintain, that you will incur a bigger penalty for pre-compiling the query and that each query you run is not as efficient as it could be (particularly with joins thrown in). Another way is to build an EntitySQL query piece by piece, like we all did with SQL. protected List<Dod> GetSomeDogs( string name, int age) { string query = "select value dog from Entities.DogSet where 1 = 1 "; if( !String.IsNullOrEmpty(name) ) query = query + " and dog.Name == @Name "; if( age > 0 ) query = query + " and dog.Age == @Age "; ObjectQuery<Dog> oQuery = new ObjectQuery<Dog>( query, YourContext ); if( !String.IsNullOrEmpty(name) ) oQuery.Parameters.Add( new ObjectParameter( "Name", name ) ); if( age > 0 ) oQuery.Parameters.Add( new ObjectParameter( "Age", age ) ); return oQuery.ToList(); } Here the problems are: - there is no syntax checking during compilation - each different combination of parameters generate a different query which will need to be pre-compiled when it is first run. In this case, there are only 4 different possible queries (no params, age-only, name-only and both params), but you can see that there can be way more with a normal world search. - Noone likes to concatenate strings! Another option is to query a large subset of the data and then narrow it down in memory. This is particularly useful if you are working with a definite subset of the data, like all the dogs in a city. You know there are a lot but you also know there aren't that many... so your CityDog search page can load all the dogs for the city in memory, which is a single pre-compiled query and then refine the results protected List<Dod> GetSomeDogs( string name, int age, string city) { string query = "select value dog from Entities.DogSet where dog.Owner.Address.City == @City "; ObjectQuery<Dog> oQuery = new ObjectQuery<Dog>( query, YourContext ); oQuery.Parameters.Add( new ObjectParameter( "City", city ) ); List<Dog> dogs = oQuery.ToList(); if( !String.IsNullOrEmpty(name) ) dogs = dogs.Where( it => it.Name == name ); if( age > 0 ) dogs = dogs.Where( it => it.Age == age ); return dogs; } It is particularly useful when you start displaying all the data then allow for filtering. Problems: - Could lead to serious data transfer if you are not careful about your subset. - You can only filter on the data that you returned. It means that if you don't return the Dog.Owner association, you will not be able to filter on the Dog.Owner.Name So what is the best solution? There isn't any. You need to pick the solution that works best for you and your problem: - Use lambda-based query building when you don't care about pre-compiling your queries. - Use fully-defined pre-compiled Linq query when your object structure is not too complex. - Use EntitySQL/string concatenation when the structure could be complex and when the possible number of different resulting queries are small (which means fewer pre-compilation hits). - Use in-memory filtering when you are working with a smallish subset of the data or when you had to fetch all of the data on the data at first anyway (if the performance is fine with all the data, then filtering in memory will not cause any time to be spent in the db). Singleton access The best way to deal with your context and entities accross all your pages is to use the singleton pattern: public sealed class YourContext { private const string instanceKey = "On3GoModelKey"; YourContext(){} public static YourEntities Instance { get { HttpContext context = HttpContext.Current; if( context == null ) return Nested.instance; if (context.Items[instanceKey] == null) { On3GoEntities entity = new On3GoEntities(); context.Items[instanceKey] = entity; } return (YourEntities)context.Items[instanceKey]; } } class Nested { // Explicit static constructor to tell C# compiler // not to mark type as beforefieldinit static Nested() { } internal static readonly YourEntities instance = new YourEntities(); } } NoTracking, is it worth it? When executing a query, you can tell the framework to track the objects it will return or not. What does it mean? With tracking enabled (the default option), the framework will track what is going on with the object (has it been modified? Created? Deleted?) and will also link objects together, when further queries are made from the database, which is what is of interest here. For example, lets assume that Dog with ID == 2 has an owner which ID == 10. Dog dog = (from dog in YourContext.DogSet where dog.ID == 2 select dog).FirstOrDefault(); //dog.OwnerReference.IsLoaded == false; Person owner = (from o in YourContext.PersonSet where o.ID == 10 select dog).FirstOrDefault(); //dog.OwnerReference.IsLoaded == true; If we were to do the same with no tracking, the result would be different. ObjectQuery<Dog> oDogQuery = (ObjectQuery<Dog>) (from dog in YourContext.DogSet where dog.ID == 2 select dog); oDogQuery.MergeOption = MergeOption.NoTracking; Dog dog = oDogQuery.FirstOrDefault(); //dog.OwnerReference.IsLoaded == false; ObjectQuery<Person> oPersonQuery = (ObjectQuery<Person>) (from o in YourContext.PersonSet where o.ID == 10 select o); oPersonQuery.MergeOption = MergeOption.NoTracking; Owner owner = oPersonQuery.FirstOrDefault(); //dog.OwnerReference.IsLoaded == false; Tracking is very useful and in a perfect world without performance issue, it would always be on. But in this world, there is a price for it, in terms of performance. So, should you use NoTracking to speed things up? It depends on what you are planning to use the data for. Is there any chance that the data your query with NoTracking can be used to make update/insert/delete in the database? If so, don't use NoTracking because associations are not tracked and will causes exceptions to be thrown. In a page where there are absolutly no updates to the database, you can use NoTracking. Mixing tracking and NoTracking is possible, but it requires you to be extra careful with updates/inserts/deletes. The problem is that if you mix then you risk having the framework trying to Attach() a NoTracking object to the context where another copy of the same object exist with tracking on. Basicly, what I am saying is that Dog dog1 = (from dog in YourContext.DogSet where dog.ID == 2).FirstOrDefault(); ObjectQuery<Dog> oDogQuery = (ObjectQuery<Dog>) (from dog in YourContext.DogSet where dog.ID == 2 select dog); oDogQuery.MergeOption = MergeOption.NoTracking; Dog dog2 = oDogQuery.FirstOrDefault(); dog1 and dog2 are 2 different objects, one tracked and one not. Using the detached object in an update/insert will force an Attach() that will say "Wait a minute, I do already have an object here with the same database key. Fail". And when you Attach() one object, all of its hierarchy gets attached as well, causing problems everywhere. Be extra careful. How much faster is it with NoTracking It depends on the queries. Some are much more succeptible to tracking than other. I don't have a fast an easy rule for it, but it helps. So I should use NoTracking everywhere then? Not exactly. There are some advantages to tracking object. The first one is that the object is cached, so subsequent call for that object will not hit the database. That cache is only valid for the lifetime of the YourEntities object, which, if you use the singleton code above, is the same as the page lifetime. One page request == one YourEntity object. So for multiple calls for the same object, it will load only once per page request. (Other caching mechanism could extend that). What happens when you are using NoTracking and try to load the same object multiple times? The database will be queried each time, so there is an impact there. How often do/should you call for the same object during a single page request? As little as possible of course, but it does happens. Also remember the piece above about having the associations connected automatically for your? You don't have that with NoTracking, so if you load your data in multiple batches, you will not have a link to between them: ObjectQuery<Dog> oDogQuery = (ObjectQuery<Dog>)(from dog in YourContext.DogSet select dog); oDogQuery.MergeOption = MergeOption.NoTracking; List<Dog> dogs = oDogQuery.ToList(); ObjectQuery<Person> oPersonQuery = (ObjectQuery<Person>)(from o in YourContext.PersonSet select o); oPersonQuery.MergeOption = MergeOption.NoTracking; List<Person> owners = oPersonQuery.ToList(); In this case, no dog will have its .Owner property set. Some things to keep in mind when you are trying to optimize the performance. No lazy loading, what am I to do? This can be seen as a blessing in disguise. Of course it is annoying to load everything manually. However, it decreases the number of calls to the db and forces you to think about when you should load data. The more you can load in one database call the better. That was always true, but it is enforced now with this 'feature' of EF. Of course, you can call if( !ObjectReference.IsLoaded ) ObjectReference.Load(); if you want to, but a better practice is to force the framework to load the objects you know you will need in one shot. This is where the discussion about parametrized Includes begins to make sense. Lets say you have you Dog object public class Dog { public Dog Get(int id) { return YourContext.DogSet.FirstOrDefault(it => it.ID == id ); } } This is the type of function you work with all the time. It gets called from all over the place and once you have that Dog object, you will do very different things to it in different functions. First, it should be pre-compiled, because you will call that very often. Second, each different pages will want to have access to a different subset of the Dog data. Some will want the Owner, some the FavoriteToy, etc. Of course, you could call Load() for each reference you need anytime you need one. But that will generate a call to the database each time. Bad idea. So instead, each page will ask for the data it wants to see when it first request for the Dog object: static public Dog Get(int id) { return GetDog(entity,"");} static public Dog Get(int id, string includePath) { string query = "select value o " + " from YourEntities.DogSet as o " +

Read the article
Table sorting & pagination with jQuery and Razor in ASP.NET MVC

- by hajan

Introduction jQuery enjoys living inside pages which are built on top of ASP.NET MVC Framework. The ASP.NET MVC is a place where things are organized very well and it is quite hard to make them dirty, especially because the pattern enforces you on purity (you can still make it dirty if you want so ;) ). We all know how easy is to build a HTML table with a header row, footer row and table rows showing some data. With ASP.NET MVC we can do this pretty easy, but, the result will be pure HTML table which only shows data, but does not includes sorting, pagination or some other advanced features that we were used to have in the ASP.NET WebForms GridView. Ok, there is the WebGrid MVC Helper, but what if we want to make something from pure table in our own clean style? In one of my recent projects, I’ve been using the jQuery tablesorter and tablesorter.pager plugins that go along. You don’t need to know jQuery to make this work… You need to know little CSS to create nice design for your table, but of course you can use mine from the demo… So, what you will see in this blog is how to attach this plugin to your pure html table and a div for pagination and make your table with advanced sorting and pagination features. Demo Project Resources The resources I’m using for this demo project are shown in the following solution explorer window print screen: Content/images – folder that contains all the up/down arrow images, pagination buttons etc. You can freely replace them with your own, but keep the names the same if you don’t want to change anything in the CSS we will built later. Content/Site.css – The main css theme, where we will add the theme for our table too Controllers/HomeController.cs – The controller I’m using for this project Models/Person.cs – For this demo, I’m using Person.cs class Scripts – jquery-1.4.4.min.js, jquery.tablesorter.js, jquery.tablesorter.pager.js – required script to make the magic happens Views/Home/Index.cshtml – Index view (razor view engine) the other items are not important for the demo. ASP.NET MVC 1. Model In this demo I use only one Person class which defines Person entity with several properties. You can use your own model, maybe one which will access data from database or any other resource. Person.cs public class Person { public string Name { get; set; } public string Surname { get; set; } public string Email { get; set; } public int? Phone { get; set; } public DateTime? DateAdded { get; set; } public int? Age { get; set; } public Person(string name, string surname, string email, int? phone, DateTime? dateadded, int? age) { Name = name; Surname = surname; Email = email; Phone = phone; DateAdded = dateadded; Age = age; } } 2. View In our example, we have only one Index.chtml page where Razor View engine is used. Razor view engine is my favorite for ASP.NET MVC because it’s very intuitive, fluid and keeps your code clean. 3. Controller Since this is simple example with one page, we use one HomeController.cs where we have two methods, one of ActionResult type (Index) and another GetPeople() used to create and return list of people. HomeController.cs public class HomeController : Controller { // // GET: /Home/ public ActionResult Index() { ViewBag.People = GetPeople(); return View(); } public List<Person> GetPeople() { List<Person> listPeople = new List<Person>(); listPeople.Add(new Person("Hajan", "Selmani", "[email protected]", 070070070,DateTime.Now, 25)); listPeople.Add(new Person("Straight", "Dean", "[email protected]", 123456789, DateTime.Now.AddDays(-5), 35)); listPeople.Add(new Person("Karsen", "Livia", "[email protected]", 46874651, DateTime.Now.AddDays(-2), 31)); listPeople.Add(new Person("Ringer", "Anne", "[email protected]", null, DateTime.Now, null)); listPeople.Add(new Person("O'Leary", "Michael", "[email protected]", 32424344, DateTime.Now, 44)); listPeople.Add(new Person("Gringlesby", "Anne", "[email protected]", null, DateTime.Now.AddDays(-9), 18)); listPeople.Add(new Person("Locksley", "Stearns", "[email protected]", 2135345, DateTime.Now, null)); listPeople.Add(new Person("DeFrance", "Michel", "[email protected]", 235325352, DateTime.Now.AddDays(-18), null)); listPeople.Add(new Person("White", "Johnson", null, null, DateTime.Now.AddDays(-22), 55)); listPeople.Add(new Person("Panteley", "Sylvia", null, 23233223, DateTime.Now.AddDays(-1), 32)); listPeople.Add(new Person("Blotchet-Halls", "Reginald", null, 323243423, DateTime.Now, 26)); listPeople.Add(new Person("Merr", "South", "[email protected]", 3232442, DateTime.Now.AddDays(-5), 85)); listPeople.Add(new Person("MacFeather", "Stearns", "[email protected]", null, DateTime.Now, null)); return listPeople; } } TABLE CSS/HTML DESIGN Now, lets start with the implementation. First of all, lets create the table structure and the main CSS. 1. HTML Structure @{ Layout = null; } <!DOCTYPE html> <html> <head> <title>ASP.NET & jQuery</title>  </head> <body> <div> <table class="tablesorter"> <thead> <tr> <th> value </th> </tr> </thead> <tbody> <tr> <td>value</td> </tr> </tbody> <tfoot> <tr> <th> value </th> </tr> </tfoot> </table> <div id="pager"> </div> </div> </body> </html> So, this is the main structure you need to create for each of your tables where you want to apply the functionality we will create. Of course the scripts are referenced once ;). As you see, our table has class tablesorter and also we have a div with id pager. In the next steps we will use both these to create the needed functionalities. The complete Index.cshtml coded to get the data from controller and display in the page is: <body> <div> <table class="tablesorter"> <thead> <tr> <th>Name</th> <th>Surname</th> <th>Email</th> <th>Phone</th> <th>Date Added</th> </tr> </thead> <tbody> @{ foreach (var p in ViewBag.People) { <tr> <td>@p.Name</td> <td>@p.Surname</td> <td>@p.Email</td> <td>@p.Phone</td> <td>@p.DateAdded</td> </tr> } } </tbody> <tfoot> <tr> <th>Name</th> <th>Surname</th> <th>Email</th> <th>Phone</th> <th>Date Added</th> </tr> </tfoot> </table> <div id="pager" style="position: none;"> <form> <img src="@Url.Content("~/Content/images/first.png")" class="first" /> <img src="@Url.Content("~/Content/images/prev.png")" class="prev" /> <input type="text" class="pagedisplay" /> <img src="@Url.Content("~/Content/images/next.png")" class="next" /> <img src="@Url.Content("~/Content/images/last.png")" class="last" /> <select class="pagesize"> <option selected="selected" value="5">5</option> <option value="10">10</option> <option value="20">20</option> <option value="30">30</option> <option value="40">40</option> </select> </form> </div> </div> </body> So, mainly the structure is the same. I have added @Razor code to create table with data retrieved from the ViewBag.People which has been filled with data in the home controller. 2. CSS Design The CSS code I’ve created is: /* DEMO TABLE */ body { font-size: 75%; font-family: Verdana, Tahoma, Arial, "Helvetica Neue", Helvetica, Sans-Serif; color: #232323; background-color: #fff; } table { border-spacing:0; border:1px solid gray;} table.tablesorter thead tr .header { background-image: url(images/bg.png); background-repeat: no-repeat; background-position: center right; cursor: pointer; } table.tablesorter tbody td { color: #3D3D3D; padding: 4px; background-color: #FFF; vertical-align: top; } table.tablesorter tbody tr.odd td { background-color:#F0F0F6; } table.tablesorter thead tr .headerSortUp { background-image: url(images/asc.png); } table.tablesorter thead tr .headerSortDown { background-image: url(images/desc.png); } table th { width:150px; border:1px outset gray; background-color:#3C78B5; color:White; cursor:pointer; } table thead th:hover { background-color:Yellow; color:Black;} table td { width:150px; border:1px solid gray;} PAGINATION AND SORTING Now, when everything is ready and we have the data, lets make pagination and sorting functionalities 1. jQuery Scripts referencing <link href="@Url.Content("~/Content/Site.css")" rel="stylesheet" type="text/css" /> <script src="@Url.Content("~/Scripts/jquery-1.4.4.min.js")" type="text/javascript"></script> <script src="@Url.Content("~/Scripts/jquery.tablesorter.js")" type="text/javascript"></script> <script src="@Url.Content("~/Scripts/jquery.tablesorter.pager.js")" type="text/javascript"></script> 2. jQuery Sorting and Pagination script <script type="text/javascript"> $(function () { $("table.tablesorter").tablesorter({ widthFixed: true, sortList: [[0, 0]] }) .tablesorterPager({ container: $("#pager"), size: $(".pagesize option:selected").val() }); }); </script> So, with only two lines of code, I’m using both tablesorter and tablesorterPager plugins, giving some options to both these. Options added: tablesorter - widthFixed: true – gives fixed width of the columns tablesorter - sortList[[0,0]] – An array of instructions for per-column sorting and direction in the format: [[columnIndex, sortDirection], ... ] where columnIndex is a zero-based index for your columns left-to-right and sortDirection is 0 for Ascending and 1 for Descending. A valid argument that sorts ascending first by column 1 and then column 2 looks like: [[0,0],[1,0]] (source: http://tablesorter.com/docs/) tablesorterPager – container: $(“#pager”) – tells the pager container, the div with id pager in our case. tablesorterPager – size: the default size of each page, where I get the default value selected, so if you put selected to any other of the options in your select list, you will have this number of rows as default per page for the table too. END RESULTS 1. Table once the page is loaded (default results per page is 5 and is automatically sorted by 1st column as sortList is specified) 2. Sorted by Phone Descending 3. Changed pagination to 10 items per page 4. Sorted by Phone and Name (use SHIFT to sort on multiple columns) 5. Sorted by Date Added 6. Page 3, 5 items per page ADDITIONAL ENHANCEMENTS We can do additional enhancements to the table. We can make search for each column. I will cover this in one of my next blogs. Stay tuned. DEMO PROJECT You can download demo project source code from HERE.CONCLUSION Once you finish with the demo, run your page and open the source code. You will be amazed of the purity of your code.Working with pagination in client side can be very useful. One of the benefits is performance, but if you have thousands of rows in your tables, you will get opposite result when talking about performance. Hence, sometimes it is nice idea to make pagination on back-end. So, the compromise between both approaches would be best to combine both of them. I use at most up to 500 rows on client-side and once the user reach the last page, we can trigger ajax postback which can get the next 500 rows using server-side pagination of the same data. I would like to recommend the following blog post http://weblogs.asp.net/gunnarpeipman/archive/2010/09/14/returning-paged-results-from-repositories-using-pagedresult-lt-t-gt.aspx, which will help you understand how to return page results from repository. I hope this was helpful post for you. Wait for my next posts ;). Please do let me know your feedback. Best Regards, Hajan

Read the article
value types in the vm

- by john.rose

value types in the vm p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Times} p.p2 {margin: 0.0px 0.0px 14.0px 0.0px; font: 14.0px Times} p.p3 {margin: 0.0px 0.0px 12.0px 0.0px; font: 14.0px Times} p.p4 {margin: 0.0px 0.0px 15.0px 0.0px; font: 14.0px Times} p.p5 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Courier} p.p6 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Courier; min-height: 17.0px} p.p7 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Times; min-height: 18.0px} p.p8 {margin: 0.0px 0.0px 0.0px 36.0px; text-indent: -36.0px; font: 14.0px Times; min-height: 18.0px} p.p9 {margin: 0.0px 0.0px 12.0px 0.0px; font: 14.0px Times; min-height: 18.0px} p.p10 {margin: 0.0px 0.0px 12.0px 0.0px; font: 14.0px Times; color: #000000} li.li1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Times} li.li7 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Times; min-height: 18.0px} span.s1 {font: 14.0px Courier} span.s2 {color: #000000} span.s3 {font: 14.0px Courier; color: #000000} ol.ol1 {list-style-type: decimal} Or, enduring values for a changing world. Introduction A value type is a data type which, generally speaking, is designed for being passed by value in and out of methods, and stored by value in data structures. The only value types which the Java language directly supports are the eight primitive types. Java indirectly and approximately supports value types, if they are implemented in terms of classes. For example, both Integer and String may be viewed as value types, especially if their usage is restricted to avoid operations appropriate to Object. In this note, we propose a definition of value types in terms of a design pattern for Java classes, accompanied by a set of usage restrictions. We also sketch the relation of such value types to tuple types (which are a JVM-level notion), and point out JVM optimizations that can apply to value types. This note is a thought experiment to extend the JVM’s performance model in support of value types. The demonstration has two phases. Initially the extension can simply use design patterns, within the current bytecode architecture, and in today’s Java language. But if the performance model is to be realized in practice, it will probably require new JVM bytecode features, changes to the Java language, or both. We will look at a few possibilities for these new features. An Axiom of Value In the context of the JVM, a value type is a data type equipped with construction, assignment, and equality operations, and a set of typed components, such that, whenever two variables of the value type produce equal corresponding values for their components, the values of the two variables cannot be distinguished by any JVM operation. Here are some corollaries: A value type is immutable, since otherwise a copy could be constructed and the original could be modified in one of its components, allowing the copies to be distinguished. Changing the component of a value type requires construction of a new value. The equals and hashCode operations are strictly component-wise. If a value type is represented by a JVM reference, that reference cannot be successfully synchronized on, and cannot be usefully compared for reference equality. A value type can be viewed in terms of what it doesn’t do. We can say that a value type omits all value-unsafe operations, which could violate the constraints on value types. These operations, which are ordinarily allowed for Java object types, are pointer equality comparison (the acmp instruction), synchronization (the monitor instructions), all the wait and notify methods of class Object, and non-trivial finalize methods. The clone method is also value-unsafe, although for value types it could be treated as the identity function. Finally, and most importantly, any side effect on an object (however visible) also counts as an value-unsafe operation. A value type may have methods, but such methods must not change the components of the value. It is reasonable and useful to define methods like toString, equals, and hashCode on value types, and also methods which are specifically valuable to users of the value type. Representations of Value Value types have two natural representations in the JVM, unboxed and boxed. An unboxed value consists of the components, as simple variables. For example, the complex number x=(1+2i), in rectangular coordinate form, may be represented in unboxed form by the following pair of variables: /*Complex x = Complex.valueOf(1.0, 2.0):*/ double x_re = 1.0, x_im = 2.0; These variables might be locals, parameters, or fields. Their association as components of a single value is not defined to the JVM. Here is a sample computation which computes the norm of the difference between two complex numbers: double distance(/*Complex x:*/ double x_re, double x_im, /*Complex y:*/ double y_re, double y_im) { /*Complex z = x.minus(y):*/ double z_re = x_re - y_re, z_im = x_im - y_im; /*return z.abs():*/ return Math.sqrt(z_re*z_re + z_im*z_im); } A boxed representation groups component values under a single object reference. The reference is to a ‘wrapper class’ that carries the component values in its fields. (A primitive type can naturally be equated with a trivial value type with just one component of that type. In that view, the wrapper class Integer can serve as a boxed representation of value type int.) The unboxed representation of complex numbers is practical for many uses, but it fails to cover several major use cases: return values, array elements, and generic APIs. The two components of a complex number cannot be directly returned from a Java function, since Java does not support multiple return values. The same story applies to array elements: Java has no ’array of structs’ feature. (Double-length arrays are a possible workaround for complex numbers, but not for value types with heterogeneous components.) By generic APIs I mean both those which use generic types, like Arrays.asList and those which have special case support for primitive types, like String.valueOf and PrintStream.println. Those APIs do not support unboxed values, and offer some problems to boxed values. Any ’real’ JVM type should have a story for returns, arrays, and API interoperability. The basic problem here is that value types fall between primitive types and object types. Value types are clearly more complex than primitive types, and object types are slightly too complicated. Objects are a little bit dangerous to use as value carriers, since object references can be compared for pointer equality, and can be synchronized on. Also, as many Java programmers have observed, there is often a performance cost to using wrapper objects, even on modern JVMs. Even so, wrapper classes are a good starting point for talking about value types. If there were a set of structural rules and restrictions which would prevent value-unsafe operations on value types, wrapper classes would provide a good notation for defining value types. This note attempts to define such rules and restrictions. Let’s Start Coding Now it is time to look at some real code. Here is a definition, written in Java, of a complex number value type. @ValueSafe public final class Complex implements java.io.Serializable { // immutable component structure: public final double re, im; private Complex(double re, double im) { this.re = re; this.im = im; } // interoperability methods: public String toString() { return "Complex("+re+","+im+")"; } public List<Double> asList() { return Arrays.asList(re, im); } public boolean equals(Complex c) { return re == c.re && im == c.im; } public boolean equals(@ValueSafe Object x) { return x instanceof Complex && equals((Complex) x); } public int hashCode() { return 31*Double.valueOf(re).hashCode() + Double.valueOf(im).hashCode(); } // factory methods: public static Complex valueOf(double re, double im) { return new Complex(re, im); } public Complex changeRe(double re2) { return valueOf(re2, im); } public Complex changeIm(double im2) { return valueOf(re, im2); } public static Complex cast(@ValueSafe Object x) { return x == null ? ZERO : (Complex) x; } // utility methods and constants: public Complex plus(Complex c) { return new Complex(re+c.re, im+c.im); } public Complex minus(Complex c) { return new Complex(re-c.re, im-c.im); } public double abs() { return Math.sqrt(re*re + im*im); } public static final Complex PI = valueOf(Math.PI, 0.0); public static final Complex ZERO = valueOf(0.0, 0.0); } This is not a minimal definition, because it includes some utility methods and other optional parts. The essential elements are as follows: The class is marked as a value type with an annotation. The class is final, because it does not make sense to create subclasses of value types. The fields of the class are all non-private and final. (I.e., the type is immutable and structurally transparent.) From the supertype Object, all public non-final methods are overridden. The constructor is private. Beyond these bare essentials, we can observe the following features in this example, which are likely to be typical of all value types: One or more factory methods are responsible for value creation, including a component-wise valueOf method. There are utility methods for complex arithmetic and instance creation, such as plus and changeIm. There are static utility constants, such as PI. The type is serializable, using the default mechanisms. There are methods for converting to and from dynamically typed references, such as asList and cast. The Rules In order to use value types properly, the programmer must avoid value-unsafe operations. A helpful Java compiler should issue errors (or at least warnings) for code which provably applies value-unsafe operations, and should issue warnings for code which might be correct but does not provably avoid value-unsafe operations. No such compilers exist today, but to simplify our account here, we will pretend that they do exist. A value-safe type is any class, interface, or type parameter marked with the @ValueSafe annotation, or any subtype of a value-safe type. If a value-safe class is marked final, it is in fact a value type. All other value-safe classes must be abstract. The non-static fields of a value class must be non-public and final, and all its constructors must be private. Under the above rules, a standard interface could be helpful to define value types like Complex. Here is an example: @ValueSafe public interface ValueType extends java.io.Serializable { // All methods listed here must get redefined. // Definitions must be value-safe, which means // they may depend on component values only. List<? extends Object> asList(); int hashCode(); boolean equals(@ValueSafe Object c); String toString(); } //@ValueSafe inherited from supertype: public final class Complex implements ValueType { … The main advantage of such a conventional interface is that (unlike an annotation) it is reified in the runtime type system. It could appear as an element type or parameter bound, for facilities which are designed to work on value types only. More broadly, it might assist the JVM to perform dynamic enforcement of the rules for value types. Besides types, the annotation @ValueSafe can mark fields, parameters, local variables, and methods. (This is redundant when the type is also value-safe, but may be useful when the type is Object or another supertype of a value type.) Working forward from these annotations, an expression E is defined as value-safe if it satisfies one or more of the following: The type of E is a value-safe type. E names a field, parameter, or local variable whose declaration is marked @ValueSafe. E is a call to a method whose declaration is marked @ValueSafe. E is an assignment to a value-safe variable, field reference, or array reference. E is a cast to a value-safe type from a value-safe expression. E is a conditional expression E0 ? E1 : E2, and both E1 and E2 are value-safe. Assignments to value-safe expressions and initializations of value-safe names must take their values from value-safe expressions. A value-safe expression may not be the subject of a value-unsafe operation. In particular, it cannot be synchronized on, nor can it be compared with the “==” operator, not even with a null or with another value-safe type. In a program where all of these rules are followed, no value-type value will be subject to a value-unsafe operation. Thus, the prime axiom of value types will be satisfied, that no two value type will be distinguishable as long as their component values are equal. More Code To illustrate these rules, here are some usage examples for Complex: Complex pi = Complex.valueOf(Math.PI, 0); Complex zero = pi.changeRe(0); //zero = pi; zero.re = 0; ValueType vtype = pi; @SuppressWarnings("value-unsafe") Object obj = pi; @ValueSafe Object obj2 = pi; obj2 = new Object(); // ok List<Complex> clist = new ArrayList<Complex>(); clist.add(pi); // (ok assuming List.add param is @ValueSafe) List<ValueType> vlist = new ArrayList<ValueType>(); vlist.add(pi); // (ok) List<Object> olist = new ArrayList<Object>(); olist.add(pi); // warning: "value-unsafe" boolean z = pi.equals(zero); boolean z1 = (pi == zero); // error: reference comparison on value type boolean z2 = (pi == null); // error: reference comparison on value type boolean z3 = (pi == obj2); // error: reference comparison on value type synchronized (pi) { } // error: synch of value, unpredictable result synchronized (obj2) { } // unpredictable result Complex qq = pi; qq = null; // possible NPE; warning: “null-unsafe" qq = (Complex) obj; // warning: “null-unsafe" qq = Complex.cast(obj); // OK @SuppressWarnings("null-unsafe") Complex empty = null; // possible NPE qq = empty; // possible NPE (null pollution) The Payoffs It follows from this that either the JVM or the java compiler can replace boxed value-type values with unboxed ones, without affecting normal computations. Fields and variables of value types can be split into their unboxed components. Non-static methods on value types can be transformed into static methods which take the components as value parameters. Some common questions arise around this point in any discussion of value types. Why burden the programmer with all these extra rules? Why not detect programs automagically and perform unboxing transparently? The answer is that it is easy to break the rules accidently unless they are agreed to by the programmer and enforced. Automatic unboxing optimizations are tantalizing but (so far) unreachable ideal. In the current state of the art, it is possible exhibit benchmarks in which automatic unboxing provides the desired effects, but it is not possible to provide a JVM with a performance model that assures the programmer when unboxing will occur. This is why I’m writing this note, to enlist help from, and provide assurances to, the programmer. Basically, I’m shooting for a good set of user-supplied “pragmas” to frame the desired optimization. Again, the important thing is that the unboxing must be done reliably, or else programmers will have no reason to work with the extra complexity of the value-safety rules. There must be a reasonably stable performance model, wherein using a value type has approximately the same performance characteristics as writing the unboxed components as separate Java variables. There are some rough corners to the present scheme. Since Java fields and array elements are initialized to null, value-type computations which incorporate uninitialized variables can produce null pointer exceptions. One workaround for this is to require such variables to be null-tested, and the result replaced with a suitable all-zero value of the value type. That is what the “cast” method does above. Generically typed APIs like List<T> will continue to manipulate boxed values always, at least until we figure out how to do reification of generic type instances. Use of such APIs will elicit warnings until their type parameters (and/or relevant members) are annotated or typed as value-safe. Retrofitting List<T> is likely to expose flaws in the present scheme, which we will need to engineer around. Here are a couple of first approaches: public interface java.util.List<@ValueSafe T> extends Collection<T> { … public interface java.util.List<T extends Object|ValueType> extends Collection<T> { … (The second approach would require disjunctive types, in which value-safety is “contagious” from the constituent types.) With more transformations, the return value types of methods can also be unboxed. This may require significant bytecode-level transformations, and would work best in the presence of a bytecode representation for multiple value groups, which I have proposed elsewhere under the title “Tuples in the VM”. But for starters, the JVM can apply this transformation under the covers, to internally compiled methods. This would give a way to express multiple return values and structured return values, which is a significant pain-point for Java programmers, especially those who work with low-level structure types favored by modern vector and graphics processors. The lack of multiple return values has a strong distorting effect on many Java APIs. Even if the JVM fails to unbox a value, there is still potential benefit to the value type. Clustered computing systems something have copy operations (serialization or something similar) which apply implicitly to command operands. When copying JVM objects, it is extremely helpful to know when an object’s identity is important or not. If an object reference is a copied operand, the system may have to create a proxy handle which points back to the original object, so that side effects are visible. Proxies must be managed carefully, and this can be expensive. On the other hand, value types are exactly those types which a JVM can “copy and forget” with no downside. Array types are crucial to bulk data interfaces. (As data sizes and rates increase, bulk data becomes more important than scalar data, so arrays are definitely accompanying us into the future of computing.) Value types are very helpful for adding structure to bulk data, so a successful value type mechanism will make it easier for us to express richer forms of bulk data. Unboxing arrays (i.e., arrays containing unboxed values) will provide better cache and memory density, and more direct data movement within clustered or heterogeneous computing systems. They require the deepest transformations, relative to today’s JVM. There is an impedance mismatch between value-type arrays and Java’s covariant array typing, so compromises will need to be struck with existing Java semantics. It is probably worth the effort, since arrays of unboxed value types are inherently more memory-efficient than standard Java arrays, which rely on dependent pointer chains. It may be sufficient to extend the “value-safe” concept to array declarations, and allow low-level transformations to change value-safe array declarations from the standard boxed form into an unboxed tuple-based form. Such value-safe arrays would not be convertible to Object[] arrays. Certain connection points, such as Arrays.copyOf and System.arraycopy might need additional input/output combinations, to allow smooth conversion between arrays with boxed and unboxed elements. Alternatively, the correct solution may have to wait until we have enough reification of generic types, and enough operator overloading, to enable an overhaul of Java arrays. Implicit Method Definitions The example of class Complex above may be unattractively complex. I believe most or all of the elements of the example class are required by the logic of value types. If this is true, a programmer who writes a value type will have to write lots of error-prone boilerplate code. On the other hand, I think nearly all of the code (except for the domain-specific parts like plus and minus) can be implicitly generated. Java has a rule for implicitly defining a class’s constructor, if no it defines no constructors explicitly. Likewise, there are rules for providing default access modifiers for interface members. Because of the highly regular structure of value types, it might be reasonable to perform similar implicit transformations on value types. Here’s an example of a “highly implicit” definition of a complex number type: public class Complex implements ValueType { // implicitly final public double re, im; // implicitly public final //implicit methods are defined elementwise from te fields: // toString, asList, equals(2), hashCode, valueOf, cast //optionally, explicit methods (plus, abs, etc.) would go here } In other words, with the right defaults, a simple value type definition can be a one-liner. The observant reader will have noticed the similarities (and suitable differences) between the explicit methods above and the corresponding methods for List<T>. Another way to abbreviate such a class would be to make an annotation the primary trigger of the functionality, and to add the interface(s) implicitly: public @ValueType class Complex { … // implicitly final, implements ValueType (But to me it seems better to communicate the “magic” via an interface, even if it is rooted in an annotation.) Implicitly Defined Value Types So far we have been working with nominal value types, which is to say that the sequence of typed components is associated with a name and additional methods that convey the intention of the programmer. A simple ordered pair of floating point numbers can be variously interpreted as (to name a few possibilities) a rectangular or polar complex number or Cartesian point. The name and the methods convey the intended meaning. But what if we need a truly simple ordered pair of floating point numbers, without any further conceptual baggage? Perhaps we are writing a method (like “divideAndRemainder”) which naturally returns a pair of numbers instead of a single number. Wrapping the pair of numbers in a nominal type (like “QuotientAndRemainder”) makes as little sense as wrapping a single return value in a nominal type (like “Quotient”). What we need here are structural value types commonly known as tuples. For the present discussion, let us assign a conventional, JVM-friendly name to tuples, roughly as follows: public class java.lang.tuple.$DD extends java.lang.tuple.Tuple { double $1, $2; } Here the component names are fixed and all the required methods are defined implicitly. The supertype is an abstract class which has suitable shared declarations. The name itself mentions a JVM-style method parameter descriptor, which may be “cracked” to determine the number and types of the component fields. The odd thing about such a tuple type (and structural types in general) is it must be instantiated lazily, in response to linkage requests from one or more classes that need it. The JVM and/or its class loaders must be prepared to spin a tuple type on demand, given a simple name reference, $xyz, where the xyz is cracked into a series of component types. (Specifics of naming and name mangling need some tasteful engineering.) Tuples also seem to demand, even more than nominal types, some support from the language. (This is probably because notations for non-nominal types work best as combinations of punctuation and type names, rather than named constructors like Function3 or Tuple2.) At a minimum, languages with tuples usually (I think) have some sort of simple bracket notation for creating tuples, and a corresponding pattern-matching syntax (or “destructuring bind”) for taking tuples apart, at least when they are parameter lists. Designing such a syntax is no simple thing, because it ought to play well with nominal value types, and also with pre-existing Java features, such as method parameter lists, implicit conversions, generic types, and reflection. That is a task for another day. Other Use Cases Besides complex numbers and simple tuples there are many use cases for value types. Many tuple-like types have natural value-type representations. These include rational numbers, point locations and pixel colors, and various kinds of dates and addresses. Other types have a variable-length ‘tail’ of internal values. The most common example of this is String, which is (mathematically) a sequence of UTF-16 character values. Similarly, bit vectors, multiple-precision numbers, and polynomials are composed of sequences of values. Such types include, in their representation, a reference to a variable-sized data structure (often an array) which (somehow) represents the sequence of values. The value type may also include ’header’ information. Variable-sized values often have a length distribution which favors short lengths. In that case, the design of the value type can make the first few values in the sequence be direct ’header’ fields of the value type. In the common case where the header is enough to represent the whole value, the tail can be a shared null value, or even just a null reference. Note that the tail need not be an immutable object, as long as the header type encapsulates it well enough. This is the case with String, where the tail is a mutable (but never mutated) character array. Field types and their order must be a globally visible part of the API. The structure of the value type must be transparent enough to have a globally consistent unboxed representation, so that all callers and callees agree about the type and order of components that appear as parameters, return types, and array elements. This is a trade-off between efficiency and encapsulation, which is forced on us when we remove an indirection enjoyed by boxed representations. A JVM-only transformation would not care about such visibility, but a bytecode transformation would need to take care that (say) the components of complex numbers would not get swapped after a redefinition of Complex and a partial recompile. Perhaps constant pool references to value types need to declare the field order as assumed by each API user. This brings up the delicate status of private fields in a value type. It must always be possible to load, store, and copy value types as coordinated groups, and the JVM performs those movements by moving individual scalar values between locals and stack. If a component field is not public, what is to prevent hostile code from plucking it out of the tuple using a rogue aload or astore instruction? Nothing but the verifier, so we may need to give it more smarts, so that it treats value types as inseparable groups of stack slots or locals (something like long or double). My initial thought was to make the fields always public, which would make the security problem moot. But public is not always the right answer; consider the case of String, where the underlying mutable character array must be encapsulated to prevent security holes. I believe we can win back both sides of the tradeoff, by training the verifier never to split up the components in an unboxed value. Just as the verifier encapsulates the two halves of a 64-bit primitive, it can encapsulate the the header and body of an unboxed String, so that no code other than that of class String itself can take apart the values. Similar to String, we could build an efficient multi-precision decimal type along these lines: public final class DecimalValue extends ValueType { protected final long header; protected private final BigInteger digits; public DecimalValue valueOf(int value, int scale) { assert(scale >= 0); return new DecimalValue(((long)value << 32) + scale, null); } public DecimalValue valueOf(long value, int scale) { if (value == (int) value) return valueOf((int)value, scale); return new DecimalValue(-scale, new BigInteger(value)); } } Values of this type would be passed between methods as two machine words. Small values (those with a significand which fits into 32 bits) would be represented without any heap data at all, unless the DecimalValue itself were boxed. (Note the tension between encapsulation and unboxing in this case. It would be better if the header and digits fields were private, but depending on where the unboxing information must “leak”, it is probably safer to make a public revelation of the internal structure.) Note that, although an array of Complex can be faked with a double-length array of double, there is no easy way to fake an array of unboxed DecimalValues. (Either an array of boxed values or a transposed pair of homogeneous arrays would be reasonable fallbacks, in a current JVM.) Getting the full benefit of unboxing and arrays will require some new JVM magic. Although the JVM emphasizes portability, system dependent code will benefit from using machine-level types larger than 64 bits. For example, the back end of a linear algebra package might benefit from value types like Float4 which map to stock vector types. This is probably only worthwhile if the unboxing arrays can be packed with such values. More Daydreams A more finely-divided design for dynamic enforcement of value safety could feature separate marker interfaces for each invariant. An empty marker interface Unsynchronizable could cause suitable exceptions for monitor instructions on objects in marked classes. More radically, a Interchangeable marker interface could cause JVM primitives that are sensitive to object identity to raise exceptions; the strangest result would be that the acmp instruction would have to be specified as raising an exception. @ValueSafe public interface ValueType extends java.io.Serializable, Unsynchronizable, Interchangeable { … public class Complex implements ValueType { // inherits Serializable, Unsynchronizable, Interchangeable, @ValueSafe … It seems possible that Integer and the other wrapper types could be retro-fitted as value-safe types. This is a major change, since wrapper objects would be unsynchronizable and their references interchangeable. It is likely that code which violates value-safety for wrapper types exists but is uncommon. It is less plausible to retro-fit String, since the prominent operation String.intern is often used with value-unsafe code. We should also reconsider the distinction between boxed and unboxed values in code. The design presented above obscures that distinction. As another thought experiment, we could imagine making a first class distinction in the type system between boxed and unboxed representations. Since only primitive types are named with a lower-case initial letter, we could define that the capitalized version of a value type name always refers to the boxed representation, while the initial lower-case variant always refers to boxed. For example: complex pi = complex.valueOf(Math.PI, 0); Complex boxPi = pi; // convert to boxed myList.add(boxPi); complex z = myList.get(0); // unbox Such a convention could perhaps absorb the current difference between int and Integer, double and Double. It might also allow the programmer to express a helpful distinction among array types. As said above, array types are crucial to bulk data interfaces, but are limited in the JVM. Extending arrays beyond the present limitations is worth thinking about; for example, the Maxine JVM implementation has a hybrid object/array type. Something like this which can also accommodate value type components seems worthwhile. On the other hand, does it make sense for value types to contain short arrays? And why should random-access arrays be the end of our design process, when bulk data is often sequentially accessed, and it might make sense to have heterogeneous streams of data as the natural “jumbo” data structure. These considerations must wait for another day and another note. More Work It seems to me that a good sequence for introducing such value types would be as follows: Add the value-safety restrictions to an experimental version of javac. Code some sample applications with value types, including Complex and DecimalValue. Create an experimental JVM which internally unboxes value types but does not require new bytecodes to do so. Ensure the feasibility of the performance model for the sample applications. Add tuple-like bytecodes (with or without generic type reification) to a major revision of the JVM, and teach the Java compiler to switch in the new bytecodes without code changes. A staggered roll-out like this would decouple language changes from bytecode changes, which is always a convenient thing. A similar investigation should be applied (concurrently) to array types. In this case, it seems to me that the starting point is in the JVM: Add an experimental unboxing array data structure to a production JVM, perhaps along the lines of Maxine hybrids. No bytecode or language support is required at first; everything can be done with encapsulated unsafe operations and/or method handles. Create an experimental JVM which internally unboxes value types but does not require new bytecodes to do so. Ensure the feasibility of the performance model for the sample applications. Add tuple-like bytecodes (with or without generic type reification) to a major revision of the JVM, and teach the Java compiler to switch in the new bytecodes without code changes. That’s enough musing me for now. Back to work!

Read the article

Search Results

Search found 28590 results on 1144 pages for 'best'.

Page 956/1144 | < Previous Page | 952 953 954 955 956 957 958 959 960 961 962 963 | Next Page >

- by AlexV

- by CT

- by Hertfordian

- by MrDobalina

- by r_honey

- by 300 baud

- by Dustin

- by HannesFostie

- by John Gardeniers

- by Philipp Lenssen

- by Will

- by user2526699

- by ABrown

- by John Gardeniers

- by TheProfoundGeek

- by john G

- by complexgeek

- by mschultz

- by WebSolProv

- by HornedBeast

- by Juve

- by nat

- by AD

- by hajan

- by john.rose

< Previous Page | 952 953 954 955 956 957 958 959 960 961 962 963 | Next Page >