Search Results

Search found 66013 results on 2641 pages for 'big data analytics'.

Page 49/2641 | < Previous Page | 45 46 47 48 49 50 51 52 53 54 55 56  | Next Page >

  • SQL SERVER – Log File Growing for Model Database – model Database Log File Grew Too Big

    - by pinaldave
    After reading my earlier article SQL SERVER – master Database Log File Grew Too Big, I received an email recently from another reader asking why does the log file of model database grow every day when he is not carrying out any operation in the model database. As per the email, he is absolutely sure that he is doing nothing on his model database; he had used policy management to catch any T-SQL operation in the model database and there were none. This was indeed surprising to me. I sent a request to access to his server, which he happily agreed for and within a min, we figured out the issue. He was taking the backup of the model database every day taking the database backup every night. When I explained the same to him, he did not believe it; so I quickly wrote down the following script. The results before and after the usage of the script were very clear. What is a model database? The model database is used as the template for all databases created on an instance of SQL Server. Any object you create in the model database will be automatically created in subsequent user database created on the server. NOTE: Do not run this in production environment. During the demo, the model database was in full recovery mode and only full backup operation was performed (no log backup). Before Backup Script Backup Script in loop DECLARE @FLAG INT SET @FLAG = 1 WHILE(@FLAG < 1000) BEGIN BACKUP DATABASE [model] TO  DISK = N'D:\model.bak' SET @FLAG = @FLAG + 1 END GO After Backup Script Why did this happen? The model database was in full recovery mode and taking full backup is logged operation. As there was no log backup and only full backup was performed on the model database, the size of the log file kept growing. Resolution: Change the backup mode of model database from “Full Recovery” to “Simple Recovery.”. Take full backup of the model database “only” when you change something in the model database. Let me know if you have encountered a situation like this? If so, how did you resolve it? It will be interesting to know about your experience. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Backup and Restore, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • Best Upper Bound & Best Lower Bound of an Algorithm

    - by Nayefc
    I am studying for a final exam and I came past a question I had on an earlier test. The questions asks us to find the minimum value in an unsorted array of integers. We must provide the best upper bound and the best lower bound that you can for the problem in the worst case. First, in such an example, the upper and lower bound are the same (hence, we can talk in terms of Big-Theta). In the worst case, we would have to go through the whole list as the minimum value would be at the end of the list. Therefore, the answer is Big-Theta(n). Is this a correct & good explanation?

    Read the article

  • JSR 360 and JSR 361: A Big Leap for Java ME 8

    - by terrencebarr
    It might have gone unnoticed to some, but Java ME took a big leap forward a couple of weeks ago with the filing of two new JSRs: JSR 360: “Connected Limited Device Configuration 8″ (aka CLDC 8) JSR 361: “Java ME Embedded Profile” (aka ME EP) Together, these two JSRs will significantly update, enhance, and modernize the Java ME platform, and specifically small embedded Java, with a host of new features and functionality. JSR 360 – Connected Limited Device Configuration 8 CLDC 8 is based on JSR 139 (CLDC 1.1) and updates the core Java ME VM, language support, libraries, and features to be aligned with Java SE 8. This will include: VM updated to comply with the JVM language specification version 2 Support for SE 7/8 language features like Generics, Assertions, Annotations, Try-with-Resources, and more New libraries such as Collections, NIO subset, Logging API subset A consolidated and enhanced Generic Connection Framework for multi-protocol I/O With CLDC 8, Java ME and Java SE are entering their next phase of alignment – making Java the only technology today that truly scales application development, code re-use, and tooling across the whole range of IT platforms, from small embedded to large enterprise. JSR 361 – Java ME Embedded Profile ME EP is based on JSR 228 (IMP-NG) and updates the specification in key areas to provide a powerful and flexible application environment for small embedded Java platforms, building on the features of CLDC 8:  A new, lightweight component and services model Shared libraries Multi-application concurrency, inter-application communication, and event system Application management API optionality, to address low-footprint use cases With ME EP, application developers will have a modern application environment which allows development and deployment of  modular, robust, sophisticated, and footprint-optimized solutions for a wide range of embedded use cases and devices. Summary While these JSRs are still under development, it’s clear that there are exciting new times ahead for Java ME – turning into a serious application platform while maintaining the focus on resource-constrained devices to address the expected explosion of small, smart, and connected embedded platforms. To learn more, click on the above links for JSR 360 and JSR 361. Or review the JavaOne 2012 online presentations on the topic: CON11300: Expanding the reach of the Java ME Platform CON5943: Java ME 8 Service Platform And stay tuned for more in this space! Cheers, – Terrence Filed under: Mobile & Embedded Tagged: "jsr 360", "jsr 361", "me 8", embedded, Embedded Java, JCP

    Read the article

  • JSR 360 and JSR 361: A Big Leap for Java ME 8

    - by terrencebarr
    It might have gone unnoticed to some, but Java ME took a big leap forward a couple of weeks ago with the filing of two new JSRs: JSR 360: “Connected Limited Device Configuration 8″ (aka CLDC 8) JSR 361: “Java ME Embedded Profile” (aka ME EP) Together, these two JSRs will significantly update, enhance, and modernize the Java ME platform, and specifically small embedded Java, with a host of new features and functionality. JSR 360 – Connected Limited Device Configuration 8 CLDC 8 is based on JSR 139 (CLDC 1.1) and updates the core Java ME VM, language support, libraries, and features to be aligned with Java SE 8. This will include: VM updated to comply with the JVM language specification version 2 Support for SE 7/8 language features like Generics, Assertions, Annotations, Try-with-Resources, and more New libraries such as Collections, NIO subset, Logging API subset A consolidated and enhanced Generic Connection Framework for multi-protocol I/O With CLDC 8, Java ME and Java SE are entering their next phase of alignment – making Java the only technology today that truly scales application development, code re-use, and tooling across the whole range of IT platforms, from small embedded to large enterprise. JSR 361 – Java ME Embedded Profile ME EP is based on JSR 228 (IMP-NG) and updates the specification in key areas to provide a powerful and flexible application environment for small embedded Java platforms, building on the features of CLDC 8:  A new, lightweight component and services model Shared libraries Multi-application concurrency, inter-application communication, and event system Application management API optionality, to address low-footprint use cases With ME EP, application developers will have a modern application environment which allows development and deployment of  modular, robust, sophisticated, and footprint-optimized solutions for a wide range of embedded use cases and devices. Summary While these JSRs are still under development, it’s clear that there are exciting new times ahead for Java ME – turning into a serious application platform while maintaining the focus on resource-constrained devices to address the expected explosion of small, smart, and connected embedded platforms. To learn more, click on the above links for JSR 360 and JSR 361. Or review the JavaOne 2012 online presentations on the topic: CON11300: Expanding the reach of the Java ME Platform CON5943: Java ME 8 Service Platform And stay tuned for more in this space! Cheers, – Terrence Filed under: Mobile & Embedded Tagged: "jsr 360", "jsr 361", "me 8", embedded, Embedded Java, JCP

    Read the article

  • Refreshing imported MySQL data with MySQL for Excel

    - by Javier Rivera
    Welcome to another blog post from the MySQL for Excel Team. Today we're going to talk about a new feature included since MySQL for Excel 1.3.0, you can install the latest GA or maintenance version using the MySQL Installer or optionally you can download directly any GA or non-GA version from the MySQL Developer Zone.As some users suggested in our forums we should be maintaining the link between tables and Excel not only when editing data through the Edit MySQL Data option, but also when importing data via Import MySQL Data. Before 1.3.0 this process only provided you with an offline copy of the Table's data into Excel and you had no way to refresh that information from the DB later on. Now, with this new feature we'll show you how easy is to work with the latest available information at all times. This feature is transparent to you (it doesn't require additional steps to work as long as the users had the Create an Excel Table for the imported MySQL table data option enabled. To ensure you have this option checked, click over Advanced Options... after the Import Data dialog is displayed). The current blog post assumes you already know how to import data into excel, you could always take a look at our previous post How To - Guide to Importing Data from a MySQL Database to Excel using MySQL for Excel if you need further reference on that topic. After importing Data from a MySQL Table into Excel, you can refresh the data in 3 ways.1. Simply right click over the range of the imported data, to show the pop-up menu: Click over the Refresh button to obtain the latest copy of the data in the table. 2. Click the Refresh button on the Data ribbon: 3. Click the Refresh All button in the Data ribbon (beware this will refresh all Excel tables in the Workbook): Please take a note of a couple of details here, the first one is about the size of the table. If by the time you refresh the table new columns had been added to it, and you originally have imported all columns, the table will grow to the right. The same applies to rows, if the table has new rows and you did not limit the results , the table will grow to to the bottom of the sheet in Excel. The second detail you should take into account is this operation will overwrite any changes done to the cells after the table was originally imported or previously refreshed: Now with this new feature, imported data remains linked to the data source and is available to be updated at all times. It empowers the user to always be able to work with the latest version of the imported MySQL data. We hope you like this this new feature and give it a try! Remember that your feedback is very important for us, so drop us a message with your comments, suggestions for this or other features and follow us at our social media channels: MySQL on Windows (this) Blog: https://blogs.oracle.com/MySqlOnWindows/ MySQL for Excel forum: http://forums.mysql.com/list.php?172 Facebook: http://www.facebook.com/mysql YouTube channel: https://www.youtube.com/user/MySQLChannel Thanks!

    Read the article

  • Analyzing data from same tables in diferent db instances.

    - by Oscar Reyes
    Short version: How can I map two columns from table A and B if they both have a common identifier which in turn may have two values in column C Lets say: A --- 1 , 2 B --- ? , 3 C ----- 45, 2 45, 3 Using table C I know that id 2 and 3 belong to the same item ( 45 ) and thus "?" in table B should be 1. What query could do something like that? EDIT Long version ommited. It was really boring/confusing EDIT I'm posting some output here. From this query: select distinct( rolein) , activityin from taskperformance@dm_prod where activityin in ( select activityin from activities@dm_prod where activityid in ( select activityid from activities@dm_prod where activityin in ( select distinct( activityin ) from taskperformance where rolein = 0 ) ) ) I have the following parts: select distinct( activityin ) from taskperformance where rolein = 0 Output: http://question1337216.pastebin.com/f5039557 select activityin from activities@dm_prod where activityid in ( select activityid from activities@dm_prod where activityin in ( select distinct( activityin ) from taskperformance where rolein = 0 ) ) Output: http://question1337216.pastebin.com/f6cef9393 And finally: select distinct( rolein) , activityin from taskperformance@dm_prod where activityin in ( select activityin from activities@dm_prod where activityid in ( select activityid from activities@dm_prod where activityin in ( select distinct( activityin ) from taskperformance where rolein = 0 ) ) ) Output: http://question1337216.pastebin.com/f346057bd Take for instace activityin 335 from first query ( from taskperformance B) . It is present in actvities from A. But is not in taskperformace in A ( but a the related activities: 92, 208, 335, 595 ) Are present in the result. The corresponding role in is: 1

    Read the article

  • Good approach for hundreds of comsumers and big files

    - by ????? ???????
    I have several files (nearly 1GB each) with data. Data is a string line. I need to process each of these files with several hundreds of consumers. Each of these consumers does some processing that differs from others. Consumers do not write anywhere concurrently. They only need input string. After processing they update their local buffers. Consumers can easily be executed in parallel. Important: With one specific file each consumer has to process all lines (without skipping) in correct order (as they appear in file). The order of processing different files doesn't matter. Processing of a single line by one consumer is comparably fast. I expect less than 50 microseconds on Corei5. So now I'm looking for the good approach to this problem. This is going to be be a part of a .NET project, so please let's stick with .NET only (C# is preferable). I know about TPL and DataFlow. I guess that the most relevant would be BroadcastBlock. But i think that the problem here is that with each line I'll have to wait for all consumers to finish in order to post the new one. I guess that it would be not very efficient. I think that ideally situation would be something like this: One thread reads from file and writes to the buffer. Each consumer, when it is ready, reads the line from the buffer concurrently and processes it. The entry from the buffer shouldn't be deleted as one consumer reads it. It can be deleted only when all consumers have processed it. TPL schedules consumer threads itself. If one consumer outperforms the others, it shouldn't wait and can read more recent entries from the buffer. Am i right with this kind of approach? Whether yes or not, how can i implement the good solution? A bit was already discussed on StackOverflow: link

    Read the article

  • Extract wrong data from a frame in C?

    - by ipkiss
    I am writing a program that reads the data from the serial port on Linux. The data are sent by another device with the following frame format: |start | Command | Data | CRC | End | |0x02 | 0x41 | (0-127 octets) | | 0x03| ---------------------------------------------------- The Data field contains 127 octets as shown and octet 1,2 contains one type of data; octet 3,4 contains another data. I need to get these data. Because in C, one byte can only holds one character and in the start field of the frame, it is 0x02 which means STX which is 3 characters. So, in order to test my program, On the sender side, I construct an array as the frame formatted above like: char frame[254]; frame[0] = 0x02; // starting field frame[1] = 0x41; // command field which is character 'A' ..so on.. And, then On the receiver side, I take out the fields like: char result[254]; // read data read(result); printf("command = %c", result[1]); // get the command field of the frame // get other field's values the command field value (result[1]) is not character 'A'. I think, this because the first field value of the frame is 0x02 (STX) occupying 3 first places in the array frame and leading to the wrong results on the receiver side. How can I correct the issue or am I doing something wrong at the sender side? Thanks all. related questions: http://stackoverflow.com/questions/2500567/parse-and-read-data-frame-in-c http://stackoverflow.com/questions/2531779/clear-data-at-serial-port-in-linux-in-c

    Read the article

  • Use virtual pageviews for all goal tracking

    - by Jeff Wu
    I'm new to Google Analytics and I'm wondering if it would be cleaner to user virtual pageviews for all the goal tracking on my website instead of using a mix of regular page views and virtual pageviews. I know in most cases this is just semantics but there are multiple pages where the same goal can be achieved and I think it would be cleaner just to fire the same virtual pageview instead of having two different goal pages. Will this model also give developers more flexibility when they do development? I know we are moving to a CMS and urls can get hairy, so I think this might be a good way to make analytics portion of the site "future proof". Any thoughts are appreciated! Thanks.

    Read the article

  • Fraud Detection with the SQL Server Suite Part 1

    - by Dejan Sarka
    While working on different fraud detection projects, I developed my own approach to the solution for this problem. In my PASS Summit 2013 session I am introducing this approach. I also wrote a whitepaper on the same topic, which was generously reviewed by my friend Matija Lah. In order to spread this knowledge faster, I am starting a series of blog posts which will at the end make the whole whitepaper. Abstract With the massive usage of credit cards and web applications for banking and payment processing, the number of fraudulent transactions is growing rapidly and on a global scale. Several fraud detection algorithms are available within a variety of different products. In this paper, we focus on using the Microsoft SQL Server suite for this purpose. In addition, we will explain our original approach to solving the problem by introducing a continuous learning procedure. Our preferred type of service is mentoring; it allows us to perform the work and consulting together with transferring the knowledge onto the customer, thus making it possible for a customer to continue to learn independently. This paper is based on practical experience with different projects covering online banking and credit card usage. Introduction A fraud is a criminal or deceptive activity with the intention of achieving financial or some other gain. Fraud can appear in multiple business areas. You can find a detailed overview of the business domains where fraud can take place in Sahin Y., & Duman E. (2011), Detecting Credit Card Fraud by Decision Trees and Support Vector Machines, Proceedings of the International MultiConference of Engineers and Computer Scientists 2011 Vol 1. Hong Kong: IMECS. Dealing with frauds includes fraud prevention and fraud detection. Fraud prevention is a proactive mechanism, which tries to disable frauds by using previous knowledge. Fraud detection is a reactive mechanism with the goal of detecting suspicious behavior when a fraudster surpasses the fraud prevention mechanism. A fraud detection mechanism checks every transaction and assigns a weight in terms of probability between 0 and 1 that represents a score for evaluating whether a transaction is fraudulent or not. A fraud detection mechanism cannot detect frauds with a probability of 100%; therefore, manual transaction checking must also be available. With fraud detection, this manual part can focus on the most suspicious transactions. This way, an unchanged number of supervisors can detect significantly more frauds than could be achieved with traditional methods of selecting which transactions to check, for example with random sampling. There are two principal data mining techniques available both in general data mining as well as in specific fraud detection techniques: supervised or directed and unsupervised or undirected. Supervised techniques or data mining models use previous knowledge. Typically, existing transactions are marked with a flag denoting whether a particular transaction is fraudulent or not. Customers at some point in time do report frauds, and the transactional system should be capable of accepting such a flag. Supervised data mining algorithms try to explain the value of this flag by using different input variables. When the patterns and rules that lead to frauds are learned through the model training process, they can be used for prediction of the fraud flag on new incoming transactions. Unsupervised techniques analyze data without prior knowledge, without the fraud flag; they try to find transactions which do not resemble other transactions, i.e. outliers. In both cases, there should be more frauds in the data set selected for checking by using the data mining knowledge compared to selecting the data set with simpler methods; this is known as the lift of a model. Typically, we compare the lift with random sampling. The supervised methods typically give a much better lift than the unsupervised ones. However, we must use the unsupervised ones when we do not have any previous knowledge. Furthermore, unsupervised methods are useful for controlling whether the supervised models are still efficient. Accuracy of the predictions drops over time. Patterns of credit card usage, for example, change over time. In addition, fraudsters continuously learn as well. Therefore, it is important to check the efficiency of the predictive models with the undirected ones. When the difference between the lift of the supervised models and the lift of the unsupervised models drops, it is time to refine the supervised models. However, the unsupervised models can become obsolete as well. It is also important to measure the overall efficiency of both, supervised and unsupervised models, over time. We can compare the number of predicted frauds with the total number of frauds that include predicted and reported occurrences. For measuring behavior across time, specific analytical databases called data warehouses (DW) and on-line analytical processing (OLAP) systems can be employed. By controlling the supervised models with unsupervised ones and by using an OLAP system or DW reports to control both, a continuous learning infrastructure can be established. There are many difficulties in developing a fraud detection system. As has already been mentioned, fraudsters continuously learn, and the patterns change. The exchange of experiences and ideas can be very limited due to privacy concerns. In addition, both data sets and results might be censored, as the companies generally do not want to publically expose actual fraudulent behaviors. Therefore it can be quite difficult if not impossible to cross-evaluate the models using data from different companies and different business areas. This fact stresses the importance of continuous learning even more. Finally, the number of frauds in the total number of transactions is small, typically much less than 1% of transactions is fraudulent. Some predictive data mining algorithms do not give good results when the target state is represented with a very low frequency. Data preparation techniques like oversampling and undersampling can help overcome the shortcomings of many algorithms. SQL Server suite includes all of the software required to create, deploy any maintain a fraud detection infrastructure. The Database Engine is the relational database management system (RDBMS), which supports all activity needed for data preparation and for data warehouses. SQL Server Analysis Services (SSAS) supports OLAP and data mining (in version 2012, you need to install SSAS in multidimensional and data mining mode; this was the only mode in previous versions of SSAS, while SSAS 2012 also supports the tabular mode, which does not include data mining). Additional products from the suite can be useful as well. SQL Server Integration Services (SSIS) is a tool for developing extract transform–load (ETL) applications. SSIS is typically used for loading a DW, and in addition, it can use SSAS data mining models for building intelligent data flows. SQL Server Reporting Services (SSRS) is useful for presenting the results in a variety of reports. Data Quality Services (DQS) mitigate the occasional data cleansing process by maintaining a knowledge base. Master Data Services is an application that helps companies maintaining a central, authoritative source of their master data, i.e. the most important data to any organization. For an overview of the SQL Server business intelligence (BI) part of the suite that includes Database Engine, SSAS and SSRS, please refer to Veerman E., Lachev T., & Sarka D. (2009). MCTS Self-Paced Training Kit (Exam 70-448): Microsoft® SQL Server® 2008 Business Intelligence Development and Maintenance. MS Press. For an overview of the enterprise information management (EIM) part that includes SSIS, DQS and MDS, please refer to Sarka D., Lah M., & Jerkic G. (2012). Training Kit (Exam 70-463): Implementing a Data Warehouse with Microsoft® SQL Server® 2012. O'Reilly. For details about SSAS data mining, please refer to MacLennan J., Tang Z., & Crivat B. (2009). Data Mining with Microsoft SQL Server 2008. Wiley. SQL Server Data Mining Add-ins for Office, a free download for Office versions 2007, 2010 and 2013, bring the power of data mining to Excel, enabling advanced analytics in Excel. Together with PowerPivot for Excel, which is also freely downloadable and can be used in Excel 2010, is already included in Excel 2013. It brings OLAP functionalities directly into Excel, making it possible for an advanced analyst to build a complete learning infrastructure using a familiar tool. This way, many more people, including employees in subsidiaries, can contribute to the learning process by examining local transactions and quickly identifying new patterns.

    Read the article

  • Oracle Data Integrator at Oracle OpenWorld 2012: Demonstrations

    - by Irem Radzik
    By Mike Eisterer Oracle OpenWorld is just a few days away and  we look forward to showing Oracle Data Integrator' comprehensive data integration platform, which delivers critical data integration requirements: from high-volume, high-performance batch loads, to event-driven, trickle-feed integration processes, to SOA-enabled data services.  Several Oracle Data Integrator demonstrations will be available October 1st through the3rd : Oracle Data Integrator and Oracle GoldenGate for Oracle Applications, in Moscone South, Right - S-240 Oracle Data Integrator and Service Integration, in Moscone South, Right - S-235 Oracle Data Integrator for Big Data, in Moscone South, Right - S-236 Oracle Data Integrator for Enterprise Data Warehousing, in Moscone South, Right - S-238 Additional information about OOW 2012 may be found for the following demonstrations. If you are not able to attend OpenWorld, please check out our latest resources for Data Integration.  

    Read the article

  • Use virtual pageviews for all goal tracking

    - by Jeff Wu
    Hi Pro Webmasters, I'm new to Google Analytics and I'm wondering if it would be cleaner to user virtual pageviews for all the goal tracking on my website instead of using a mix of regular page views and virtual pageviews. I know in most cases this is just semantics but there are multiple pages where the same goal can be achieved and I think it would be cleaner just to fire the same virtual pageview instead of having two different goal pages. Will this model also give developers more flexibility when they do development? I know we are moving to a CMS and urls can get hairy, so I think this might be a good way to make analytics portion of the site "future proof". Any thoughts are appreciated! Thanks.

    Read the article

  • Which algorithms/data structures should I "recognize" and know by name?

    - by Earlz
    I'd like to consider myself a fairly experienced programmer. I've been programming for over 5 years now. My weak point though is terminology. I'm self-taught, so while I know how to program, I don't know some of the more formal aspects of computer science. So, what are practical algorithms/data structures that I could recognize and know by name? Note, I'm not asking for a book recommendation about implementing algorithms. I don't care about implementing them, I just want to be able to recognize when an algorithm/data structure would be a good solution to a problem. I'm asking more for a list of algorithms/data structures that I should "recognize". For instance, I know the solution to a problem like this: You manage a set of lockers labeled 0-999. People come to you to rent the locker and then come back to return the locker key. How would you build a piece of software to manage knowing which lockers are free and which are in used? The solution, would be a queue or stack. What I'm looking for are things like "in what situation should a B-Tree be used -- What search algorithm should be used here" etc. And maybe a quick introduction of how the more complex(but commonly used) data structures/algorithms work. I tried looking at Wikipedia's list of data structures and algorithms but I think that's a bit overkill. So I'm looking more for what are the essential things I should recognize?

    Read the article

  • How does google calculate bounce rate?

    - by Luticka
    Hello We run a website where we sell access to a member area. We use Google analytics to optimize the site but i have a bounce rate in about 85%. The problem is that I'm not sure what count as a bounce and whats not. If a person go to my front page and log in directly. Will this count as a bounce because there is no analytics on the member pages or is Google smart enough to see that the link is on my side? Thanks a lot.

    Read the article

  • Understanding Data Science: Recent Studies

    - by Joe Lamantia
    If you need such a deeper understanding of data science than Drew Conway's popular venn diagram model, or Josh Wills' tongue in cheek characterization, "Data Scientist (n.): Person who is better at statistics than any software engineer and better at software engineering than any statistician." two relatively recent studies are worth reading.   'Analyzing the Analyzers,' an O'Reilly e-book by Harlan Harris, Sean Patrick Murphy, and Marck Vaisman, suggests four distinct types of data scientists -- effectively personas, in a design sense -- based on analysis of self-identified skills among practitioners.  The scenario format dramatizes the different personas, making what could be a dry statistical readout of survey data more engaging.  The survey-only nature of the data,  the restriction of scope to just skills, and the suggested models of skill-profiles makes this feel like the sort of exercise that data scientists undertake as an every day task; collecting data, analyzing it using a mix of statistical techniques, and sharing the model that emerges from the data mining exercise.  That's not an indictment, simply an observation about the consistent feel of the effort as a product of data scientists, about data science.  And the paper 'Enterprise Data Analysis and Visualization: An Interview Study' by researchers Sean Kandel, Andreas Paepcke, Joseph Hellerstein, and Jeffery Heer considers data science within the larger context of industrial data analysis, examining analytical workflows, skills, and the challenges common to enterprise analysis efforts, and identifying three archetypes of data scientist.  As an interview-based study, the data the researchers collected is richer, and there's correspondingly greater depth in the synthesis.  The scope of the study included a broader set of roles than data scientist (enterprise analysts) and involved questions of workflow and organizational context for analytical efforts in general.  I'd suggest this is useful as a primer on analytical work and workers in enterprise settings for those who need a baseline understanding; it also offers some genuinely interesting nuggets for those already familiar with discovery work. We've undertaken a considerable amount of research into discovery, analytical work/ers, and data science over the past three years -- part of our programmatic approach to laying a foundation for product strategy and highlighting innovation opportunities -- and both studies complement and confirm much of the direct research into data science that we conducted. There were a few important differences in our findings, which I'll share and discuss in upcoming posts.

    Read the article

  • IOMEGA 500GB hard disk data reccovery

    - by Vineeth
    Last year by November I bought an IOMEGA 500GB Prestige hard disk. Yesterday, unfortunately the hard disk fell down from my table. After that incident, when I connect my disk, Windows asks me to format the disk to use, but I didn't format it yet. Actually, on that hard disk I have about 320GB of data. I tried all my possible ways to access my disk. I tried using DOS. It shows "data error (Cyclic redundancy check)". I have a 3 year warranty. Will I be covered under warranty if I report this issue to IOMEGA? Can I get my data back?

    Read the article

  • NRF Big Show 2011 -- Part 2

    - by David Dorf
    One of the things I love about attending NRF is visiting the smaller booths to see what new innovative ideas have sprung up. After all, by watching emerging technologies we can get a sense of how the retail experience might change. After NRF I'm hoping to write a post on what I found, if anything, so be sure to check back. At the Oracle Retail booth we'll be demonstrating some of the aspects of the changing retail experience. These demos use a mix of GA and experimental components. Here are some highlights: 1. Checkin We wrote a consumer iPhone app we call Store Gateway that lets consumers access information from the store. They'll start by doing a checkin when they arrive that will alert the store manager via another iPhone app we wrote called Mobile Manager. Additionally, we display a welcome messaging using Starmount's digital sign. 2. Receive Offers There are three interaction points where a store can easily make an offer to a consumer: checkin, product scans, and checkout. For this demo we're calling our Universal Offer Engine at checkin to determine the best offer for this particular consumer. This offer is then displayed on the consumer's phone as well as on the digital sign. 3. Scan Products To thwart consumers from scanning product barcodes, we used Store Inventory Management to print QRCodes on shelf label then provided access to a scanner in the Store Gateway iphone app. When the consumer scans the shelf label they are shown product information provided by the retailer. 4. Checkout While we don't have a NFC-enabled mobile phone, we have a NFC chip that can attach to a phone. We're using this to checkout using a reader provided by ViVOTech. Tap the phone on the reader, and the POS accesses the customer#, coupons, and payment information. This really speeds the checkout process. 5. Digital Receipt After the transaction is complete, a digital copy of the receipt is sent to Intuit's QuickReceipts where consumers to store all their digital receipts. There's even an iPhone app that provides easy access to the receipts. This covers about half of what what we'll be showing, so be sure to stop by. I'll also be talking about how mobile is impacting the retail experience at the Wednesday morning session NRF Mobile Retail Initiative: a Blueprint for Action. See you at the Big Show!

    Read the article

  • Best stats tool for cross-domain tracking

    - by kidbrax
    We build a webapp that allows users to run the app under their own subdomain. So we run the app under search.domainX.com, search.domainY.com and so on. They each have their own Google Analytics to track individual stats. But we want to know what general traffic for all clients of our app. So we want to know stuff like "among all our clients we had x number of views." What is the best way tool to track that sort of thing. We prefer a snippet based solution similar to Google Analytics if possible.

    Read the article

  • Appending column to a data frame - R

    - by darkie15
    Is it possible to append a column to data frame in the following scenario? dfWithData <- data.frame(start=c(1,2,3), end=c(11,22,33)) dfBlank <- data.frame() ..how to append column start from dfWithData to dfBlank? It looks like the data should be added when data frame is being initialized. I can do this: dfBlank <- data.frame(dfWithData[1]) but I am more interested if it is possible to append columns to an empty (but inti)

    Read the article

  • Why is "www.mysite.com" different from "mysite.com"?

    - by sapeish
    In any browser if I use www.mysite.com or just mysite.com the web page is correctly retrieved, but I am having trouble with Google Analytics and Facebook App. Facebook: To be able to get Likes, I create the Facebook App needed and set the site URL to http://mysite.com/. Using their tool http://developers.facebook.com/tools/debug/ when I test my page using http://mypage.com it works but using http://www.mypage.com fails with the message: Object at URL 'http://www.mysite.com/' of type 'website' is invalid because the domain 'www.mysite.com' is not allowed for the specified application id. Google Analytics: To be able to get traffic statistics, I created a Property and a Profile both with the URL http://www.mypage.com and no statistics were gathered in a week, when I changed the configured URL to http://mypage.com statistics where available a few hours later. What should I do to have statistics and likes for both www.mysite.com and mysite.com ??

    Read the article

  • how to find a good data center?

    - by drewda
    At my start-up, we're getting to the point where we should be hosting our servers at a data center. I'd appreciate any tips and tricks y'all can offer on finding a reputable place to colocate our racks. Are there any Web sites with customer reviews of data centers or should I just be asking around at techie events? Are unlimited bandwidth plans a gimmick or becoming the norm? Is it worth establishing a redundant set of machines at a second data center from Day One? Or just do offsite back-ups? Thanks for your suggestions.

    Read the article

  • Protecting Consolidated Data on Engineered Systems

    - by Steve Enevold
    In this time of reduced budgets and cost cutting measures in Federal, State and Local governments, the requirement to provide services continues to grow. Many agencies are looking at consolidating their infrastructure to reduce cost and meet budget goals. Oracle's engineered systems are ideal platforms for accomplishing these goals. These systems provide unparalleled performance that is ideal for running applications and databases that traditionally run on separate dedicated environments. However, putting multiple critical applications and databases in a single architecture makes security more critical. You are putting a concentrated set of sensitive data on a single system, making it a more tempting target.  The environments were previously separated by iron so now you need to provide assurance that one group, department, or application's information is not visible to other personnel or applications resident in the Exadata system. Administration of the environments requires formal separation of duties so an administrator of one application environment cannot view or negatively impact others. Also, these systems need to be in protected environments just like other critical production servers. They should be in a data center protected by physical controls, network firewalls, intrusion detection and prevention, etc Exadata also provides unique security benefits, including a reducing attack surface by minimizing packages and services to only those required. In addition to reducing the possible system areas someone may attempt to infiltrate, Exadata has the following features: 1.    Infiniband, which functions as a secure private backplane 2.    IPTables  to perform stateful packet inspection for all nodes               Cellwall implements firewall services on each cell using IPTables 3.    Hardware accelerated encryption for data at rest on storage cells Oracle is uniquely positioned to provide the security necessary for implementing Exadata because security has been a core focus since the company's beginning. In addition to the security capabilities inherent in Exadata, Oracle security products are all certified to run in an Exadata environment. Database Vault Oracle Database Vault helps organizations increase the security of existing applications and address regulatory mandates that call for separation-of-duties, least privilege and other preventive controls to ensure data integrity and data privacy. Oracle Database Vault proactively protects application data stored in the Oracle database from being accessed by privileged database users. A unique feature of Database Vault is the ability to segregate administrative tasks including when a command can be executed, or that the DBA can manage the health of the database and objects, but may not see the data Advanced Security  helps organizations comply with privacy and regulatory mandates by transparently encrypting all application data or specific sensitive columns, such as credit cards, social security numbers, or personally identifiable information (PII). By encrypting data at rest and whenever it leaves the database over the network or via backups, Oracle Advanced Security provides the most cost-effective solution for comprehensive data protection. Label Security  is a powerful and easy-to-use tool for classifying data and mediating access to data based on its classification. Designed to meet public-sector requirements for multi-level security and mandatory access control, Oracle Label Security provides a flexible framework that both government and commercial entities worldwide can use to manage access to data on a "need to know" basis in order to protect data privacy and achieve regulatory compliance  Data Masking reduces the threat of someone in the development org taking data that has been copied from production to the development environment for testing, upgrades, etc by irreversibly replacing the original sensitive data with fictitious data so that production data can be shared safely with IT developers or offshore business partners  Audit Vault and Database Firewall Oracle Audit Vault and Database Firewall serves as a critical detective and preventive control across multiple operating systems and database platforms to protect against the abuse of legitimate access to databases responsible for almost all data breaches and cyber attacks.  Consolidation, cost-savings, and performance can now be achieved without sacrificing security. The combination of built in protection and Oracle’s industry-leading data protection solutions make Exadata an ideal platform for Federal, State, and local governments and agencies.

    Read the article

  • Create a filter to consider http://example.com/foo/bar as http://example.com/index.php/foo/bar

    - by magnetik
    I'm using URL rewriting to make my url http://example.com/foo/bar/ to http://example.com/index.php/foo/bar. I'm not linking the index.php/.. url anywhere, but for some reasons, some users arrives to the index.php url. In Google analytics, I have a lot of duplicates that are quite annoying to follow up the traffic. I've watched the Advanced filters but I'm struggling to make it works fine. Any regex and google analytics pro to help me out ?

    Read the article

  • Having a generic data type for a database table column, is it "good" practice?

    - by Yanick Rochon
    I'm working on a PHP project where some object (class member) may contain different data type. For example : class Property { private $_id; // (PK) private $_ref_id; // the object reference id (FK) private $_name; // the name of the property private $_type; // 'string', 'int', 'float(n,m)', 'datetime', etc. private $_data; // ... // ..snip.. public getters/setters } Now, I need to perform some persistence on these objects. Some properties may be a text data type, but nothing bigger than what a varchar may hold. Also, later on, I need to be able to perform searches and sorting. Is it a good practice to use a single database table for this (ie. is there a non negligible performance impact)? If it's "acceptable", then what could be the data type for the data column?

    Read the article

< Previous Page | 45 46 47 48 49 50 51 52 53 54 55 56  | Next Page >