Search Results

Search found 61856 results on 2475 pages for 'data frame'.

Page 2/2475 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >

  • Fraud Detection with the SQL Server Suite Part 2

    - by Dejan Sarka
    This is the second part of the fraud detection whitepaper. You can find the first part in my previous blog post about this topic. My Approach to Data Mining Projects It is impossible to evaluate the time and money needed for a complete fraud detection infrastructure in advance. Personally, I do not know the customer’s data in advance. I don’t know whether there is already an existing infrastructure, like a data warehouse, in place, or whether we would need to build one from scratch. Therefore, I always suggest to start with a proof-of-concept (POC) project. A POC takes something between 5 and 10 working days, and involves personnel from the customer’s site – either employees or outsourced consultants. The team should include a subject matter expert (SME) and at least one information technology (IT) expert. The SME must be familiar with both the domain in question as well as the meaning of data at hand, while the IT expert should be familiar with the structure of data, how to access it, and have some programming (preferably Transact-SQL) knowledge. With more than one IT expert the most time consuming work, namely data preparation and overview, can be completed sooner. I assume that the relevant data is already extracted and available at the very beginning of the POC project. If a customer wants to have their people involved in the project directly and requests the transfer of knowledge, the project begins with training. I strongly advise this approach as it offers the establishment of a common background for all people involved, the understanding of how the algorithms work and the understanding of how the results should be interpreted, a way of becoming familiar with the SQL Server suite, and more. Once the data has been extracted, the customer’s SME (i.e. the analyst), and the IT expert assigned to the project will learn how to prepare the data in an efficient manner. Together with me, knowledge and expertise allow us to focus immediately on the most interesting attributes and identify any additional, calculated, ones soon after. By employing our programming knowledge, we can, for example, prepare tens of derived variables, detect outliers, identify the relationships between pairs of input variables, and more, in only two or three days, depending on the quantity and the quality of input data. I favor the customer’s decision of assigning additional personnel to the project. For example, I actually prefer to work with two teams simultaneously. I demonstrate and explain the subject matter by applying techniques directly on the data managed by each team, and then both teams continue to work on the data overview and data preparation under our supervision. I explain to the teams what kind of results we expect, the reasons why they are needed, and how to achieve them. Afterwards we review and explain the results, and continue with new instructions, until we resolve all known problems. Simultaneously with the data preparation the data overview is performed. The logic behind this task is the same – again I show to the teams involved the expected results, how to achieve them and what they mean. This is also done in multiple cycles as is the case with data preparation, because, quite frankly, both tasks are completely interleaved. A specific objective of the data overview is of principal importance – it is represented by a simple star schema and a simple OLAP cube that will first of all simplify data discovery and interpretation of the results, and will also prove useful in the following tasks. The presence of the customer’s SME is the key to resolving possible issues with the actual meaning of the data. We can always replace the IT part of the team with another database developer; however, we cannot conduct this kind of a project without the customer’s SME. After the data preparation and when the data overview is available, we begin the scientific part of the project. I assist the team in developing a variety of models, and in interpreting the results. The results are presented graphically, in an intuitive way. While it is possible to interpret the results on the fly, a much more appropriate alternative is possible if the initial training was also performed, because it allows the customer’s personnel to interpret the results by themselves, with only some guidance from me. The models are evaluated immediately by using several different techniques. One of the techniques includes evaluation over time, where we use an OLAP cube. After evaluating the models, we select the most appropriate model to be deployed for a production test; this allows the team to understand the deployment process. There are many possibilities of deploying data mining models into production; at the POC stage, we select the one that can be completed quickly. Typically, this means that we add the mining model as an additional dimension to an existing DW or OLAP cube, or to the OLAP cube developed during the data overview phase. Finally, we spend some time presenting the results of the POC project to the stakeholders and managers. Even from a POC, the customer will receive lots of benefits, all at the sole risk of spending money and time for a single 5 to 10 day project: The customer learns the basic patterns of frauds and fraud detection The customer learns how to do the entire cycle with their own people, only relying on me for the most complex problems The customer’s analysts learn how to perform much more in-depth analyses than they ever thought possible The customer’s IT experts learn how to perform data extraction and preparation much more efficiently than they did before All of the attendees of this training learn how to use their own creativity to implement further improvements of the process and procedures, even after the solution has been deployed to production The POC output for a smaller company or for a subsidiary of a larger company can actually be considered a finished, production-ready solution It is possible to utilize the results of the POC project at subsidiary level, as a finished POC project for the entire enterprise Typically, the project results in several important “side effects” Improved data quality Improved employee job satisfaction, as they are able to proactively contribute to the central knowledge about fraud patterns in the organization Because eventually more minds get to be involved in the enterprise, the company should expect more and better fraud detection patterns After the POC project is completed as described above, the actual project would not need months of engagement from my side. This is possible due to our preference to transfer the knowledge onto the customer’s employees: typically, the customer will use the results of the POC project for some time, and only engage me again to complete the project, or to ask for additional expertise if the complexity of the problem increases significantly. I usually expect to perform the following tasks: Establish the final infrastructure to measure the efficiency of the deployed models Deploy the models in additional scenarios Through reports By including Data Mining Extensions (DMX) queries in OLTP applications to support real-time early warnings Include data mining models as dimensions in OLAP cubes, if this was not done already during the POC project Create smart ETL applications that divert suspicious data for immediate or later inspection I would also offer to investigate how the outcome could be transferred automatically to the central system; for instance, if the POC project was performed in a subsidiary whereas a central system is available as well Of course, for the actual project, I would repeat the data and model preparation as needed It is virtually impossible to tell in advance how much time the deployment would take, before we decide together with customer what exactly the deployment process should cover. Without considering the deployment part, and with the POC project conducted as suggested above (including the transfer of knowledge), the actual project should still only take additional 5 to 10 days. The approximate timeline for the POC project is, as follows: 1-2 days of training 2-3 days for data preparation and data overview 2 days for creating and evaluating the models 1 day for initial preparation of the continuous learning infrastructure 1 day for presentation of the results and discussion of further actions Quite frequently I receive the following question: are we going to find the best possible model during the POC project, or during the actual project? My answer is always quite simple: I do not know. Maybe, if we would spend just one hour more for data preparation, or create just one more model, we could get better patterns and predictions. However, we simply must stop somewhere, and the best possible way to do this, according to my experience, is to restrict the time spent on the project in advance, after an agreement with the customer. You must also never forget that, because we build the complete learning infrastructure and transfer the knowledge, the customer will be capable of doing further investigations independently and improve the models and predictions over time without the need for a constant engagement with me.

    Read the article

  • Big Data – Beginning Big Data Series Next Month in 21 Parts

    - by Pinal Dave
    Big Data is the next big thing. There was a time when we used to talk in terms of MB and GB of the data. However, the industry is changing and we are now moving to a conversation where we discuss about data in Petabyte, Exabyte and Zettabyte. It seems that the world is now talking about increased Volume of the data. In simple world we all think that Big Data is nothing but plenty of volume. In reality Big Data is much more than just a huge volume of the data. When talking about the data we need to understand about variety and volume along with volume. Though Big data look like a simple concept, it is extremely complex subject when we attempt to start learning the same. My Journey I have recently presented on Big Data in quite a few organizations and I have received quite a few questions during this roadshow event. I have collected all the questions which I have received and decided to post about them on the blog. In the month of October 2013, on every weekday we will be learning something new about Big Data. Every day I will share a concept/question and in the same blog post we will learn the answer of the same. Big Data – Plenty of Questions I received quite a few questions during my road trip. Here are few of the questions. I want to learn Big Data – where should I start? Do I need to know SQL to learn Big Data? What is Hadoop? There are so many organizations talking about Big Data, and every one has a different approach. How to start with big Data? Do I need to know Java to learn about Big Data? What is different between various NoSQL languages. I will attempt to answer most of the questions during the month long series in the next month. Big Data – Big Subject Big Data is a very big subject and I no way claim that I will be covering every single big data concept in this series. However, I promise that I will be indeed sharing lots of basic concepts which are revolving around Big Data. We will discuss from fundamentals about Big Data and continue further learning about it. I will attempt to cover the concept so simple that many of you might have wondered about it but afraid to ask. Your Role! During this series next month, I need your one help. Please keep on posting questions you might have related to big data as blog post comments and on Facebook Page. I will monitor them closely and will try to answer them as well during this series. Now make sure that you do not miss any single blog post in this series as every blog post will be linked to each other. You can subscribe to my feed or like my Facebook page or subscribe via email (by entering email in the blog post). Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Big Data, PostADay, SQL, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

  • Big Data – Role of Cloud Computing in Big Data – Day 11 of 21

    - by Pinal Dave
    In yesterday’s blog post we learned the importance of the NewSQL. In this article we will understand the role of Cloud in Big Data Story What is Cloud? Cloud is the biggest buzzword around from last few years. Everyone knows about the Cloud and it is extremely well defined online. In this article we will discuss cloud in the context of the Big Data. Cloud computing is a method of providing a shared computing resources to the application which requires dynamic resources. These resources include applications, computing, storage, networking, development and various deployment platforms. The fundamentals of the cloud computing are that it shares pretty much share all the resources and deliver to end users as a service.  Examples of the Cloud Computing and Big Data are Google and Amazon.com. Both have fantastic Big Data offering with the help of the cloud. We will discuss this later in this blog post. There are two different Cloud Deployment Models: 1) The Public Cloud and 2) The Private Cloud Public Cloud Public Cloud is the cloud infrastructure build by commercial providers (Amazon, Rackspace etc.) creates a highly scalable data center that hides the complex infrastructure from the consumer and provides various services. Private Cloud Private Cloud is the cloud infrastructure build by a single organization where they are managing highly scalable data center internally. Here is the quick comparison between Public Cloud and Private Cloud from Wikipedia:   Public Cloud Private Cloud Initial cost Typically zero Typically high Running cost Unpredictable Unpredictable Customization Impossible Possible Privacy No (Host has access to the data Yes Single sign-on Impossible Possible Scaling up Easy while within defined limits Laborious but no limits Hybrid Cloud Hybrid Cloud is the cloud infrastructure build with the composition of two or more clouds like public and private cloud. Hybrid cloud gives best of the both the world as it combines multiple cloud deployment models together. Cloud and Big Data – Common Characteristics There are many characteristics of the Cloud Architecture and Cloud Computing which are also essentially important for Big Data as well. They highly overlap and at many places it just makes sense to use the power of both the architecture and build a highly scalable framework. Here is the list of all the characteristics of cloud computing important in Big Data Scalability Elasticity Ad-hoc Resource Pooling Low Cost to Setup Infastructure Pay on Use or Pay as you Go Highly Available Leading Big Data Cloud Providers There are many players in Big Data Cloud but we will list a few of the known players in this list. Amazon Amazon is arguably the most popular Infrastructure as a Service (IaaS) provider. The history of how Amazon started in this business is very interesting. They started out with a massive infrastructure to support their own business. Gradually they figured out that their own resources are underutilized most of the time. They decided to get the maximum out of the resources they have and hence  they launched their Amazon Elastic Compute Cloud (Amazon EC2) service in 2006. Their products have evolved a lot recently and now it is one of their primary business besides their retail selling. Amazon also offers Big Data services understand Amazon Web Services. Here is the list of the included services: Amazon Elastic MapReduce – It processes very high volumes of data Amazon DynammoDB – It is fully managed NoSQL (Not Only SQL) database service Amazon Simple Storage Services (S3) – A web-scale service designed to store and accommodate any amount of data Amazon High Performance Computing – It provides low-tenancy tuned high performance computing cluster Amazon RedShift – It is petabyte scale data warehousing service Google Though Google is known for Search Engine, we all know that it is much more than that. Google Compute Engine – It offers secure, flexible computing from energy efficient data centers Google Big Query – It allows SQL-like queries to run against large datasets Google Prediction API – It is a cloud based machine learning tool Other Players Besides Amazon and Google we also have other players in the Big Data market as well. Microsoft is also attempting Big Data with the Cloud with Microsoft Azure. Additionally Rackspace and NASA together have initiated OpenStack. The goal of Openstack is to provide a massively scaled, multitenant cloud that can run on any hardware. Thing to Watch The cloud based solutions provides a great integration with the Big Data’s story as well it is very economical to implement as well. However, there are few things one should be very careful when deploying Big Data on cloud solutions. Here is a list of a few things to watch: Data Integrity Initial Cost Recurring Cost Performance Data Access Security Location Compliance Every company have different approaches to Big Data and have different rules and regulations. Based on various factors, one can implement their own custom Big Data solution on a cloud. Tomorrow In tomorrow’s blog post we will discuss about various Operational Databases supporting Big Data. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

  • Extract wrong data from a frame in C?

    - by ipkiss
    I am writing a program that reads the data from the serial port on Linux. The data are sent by another device with the following frame format: |start | Command | Data | CRC | End | |0x02 | 0x41 | (0-127 octets) | | 0x03| ---------------------------------------------------- The Data field contains 127 octets as shown and octet 1,2 contains one type of data; octet 3,4 contains another data. I need to get these data. Because in C, one byte can only holds one character and in the start field of the frame, it is 0x02 which means STX which is 3 characters. So, in order to test my program, On the sender side, I construct an array as the frame formatted above like: char frame[254]; frame[0] = 0x02; // starting field frame[1] = 0x41; // command field which is character 'A' ..so on.. And, then On the receiver side, I take out the fields like: char result[254]; // read data read(result); printf("command = %c", result[1]); // get the command field of the frame // get other field's values the command field value (result[1]) is not character 'A'. I think, this because the first field value of the frame is 0x02 (STX) occupying 3 first places in the array frame and leading to the wrong results on the receiver side. How can I correct the issue or am I doing something wrong at the sender side? Thanks all. related questions: http://stackoverflow.com/questions/2500567/parse-and-read-data-frame-in-c http://stackoverflow.com/questions/2531779/clear-data-at-serial-port-in-linux-in-c

    Read the article

  • My rhythm game runs choppy even with high frame rate

    - by felipedrl
    I'm coding a rhythm game and the game runs smoothly with uncapped fps. But when I try to cap it around 60 the game updates in little chunks, like hiccups, as if it was skipping frames or at a very low frame rate. The reason I need to cap frame rate is because in some computers I tested, the fps varies a lot (from ~80 - ~250 fps) and those drops are noticeable and degrade response time. Since this is a rhythm game this is very important. This issue is driving me crazy. I've spent a few weeks already on it and still can't figure out the problem. I hope someone more experienced than me could shed some light on it. I'll try to put here all the hints I've tried along with two pseudo codes for game loops I tried, so I apologize if this post gets too lengthy. 1st GameLoop: const uint UPDATE_SKIP = 1000 / 60; uint nextGameTick = SDL_GetTicks(); while(isNotDone) { // only false when a QUIT event is generated! if (processEvents()) { if (SDL_GetTicks() > nextGameTick) { update(UPDATE_SKIP); render(); nextGameTick += UPDATE_SKIP; } } } 2nd Game Loop: const uint UPDATE_SKIP = 1000 / 60; while (isNotDone) { LARGE_INTEGER startTime; QueryPerformanceCounter(&startTime); // process events will return false in case of a QUIT event processed if (processEvents()) { update(frameTime); render(); } LARGE_INTEGER endTime; do { QueryPerformanceCounter(&endTime); frameTime = static_cast<uint>((endTime.QuadPart - startTime.QuadPart) * 1000.0 / frequency.QuadPart); } while (frameTime < UPDATE_SKIP); } [1] At first I thought it was a timer resolution problem. I was using SDL_GetTicks, but even when I switched to QueryPerformanceCounter, supposedly less granular, I saw no difference. [2] Then I thought it could be due to a rounding error in my position computation and since game updates are smaller in high FPS that would be less noticeable. Indeed there is an small error, but from my tests I realized that it is not enough to produce the position jumps I'm getting. Also, another intriguing factor is that if I enable vsync I'll get smooth updates @60fps regardless frame cap code. So why not rely on vsync? Because some computers can force a disable on gfx card config. [3] I started printing the maximum and minimum frame time measured in 1sec span, in the hope that every a few frames one would take a long time but still not enough to drop my fps computation. It turns out that, with frame cap code I always get frame times in the range of [16, 18]ms, and still, the game "does not moves like jagger". [4] My process' priority is set to HIGH (Windows doesn't allow me to set REALTIME for some reason). As far as I know there is only one thread running along with the game (a sound callback, which I really don't have access to it). I'm using AudiereLib. I then disabled Audiere by removing it from the project and still got the issue. Maybe there are some others threads running and one of them is taking too long to come back right in between when I measured frame times, I don't know. Is there a way to know which threads are attached to my process? [5] There are some dynamic data being created during game run. But It is a little bit hard to remove it to test. Maybe I'll have to try harder this one. Well, as I told you I really don't know what to try next. Anything, I mean, anything would be of great help. What bugs me more is why at 60fps & vsync enabled I get an smooth update and at 60fps & no vsync I don't. Is there a way to implement software vsync? I mean, query display sync info? Thanks in advance. I appreciate the ones that got this far and yet again I apologize for the long post. Best Regards from a fellow coder.

    Read the article

  • Big Data – Buzz Words: Importance of Relational Database in Big Data World – Day 9 of 21

    - by Pinal Dave
    In yesterday’s blog post we learned what is HDFS. In this article we will take a quick look at the importance of the Relational Database in Big Data world. A Big Question? Here are a few questions I often received since the beginning of the Big Data Series - Does the relational database have no space in the story of the Big Data? Does relational database is no longer relevant as Big Data is evolving? Is relational database not capable to handle Big Data? Is it true that one no longer has to learn about relational data if Big Data is the final destination? Well, every single time when I hear that one person wants to learn about Big Data and is no longer interested in learning about relational database, I find it as a bit far stretched. I am not here to give ambiguous answers of It Depends. I am personally very clear that one who is aspiring to become Big Data Scientist or Big Data Expert they should learn about relational database. NoSQL Movement The reason for the NoSQL Movement in recent time was because of the two important advantages of the NoSQL databases. Performance Flexible Schema In personal experience I have found that when I use NoSQL I have found both of the above listed advantages when I use NoSQL database. There are instances when I found relational database too much restrictive when my data is unstructured as well as they have in the datatype which my Relational Database does not support. It is the same case when I have found that NoSQL solution performing much better than relational databases. I must say that I am a big fan of NoSQL solutions in the recent times but I have also seen occasions and situations where relational database is still perfect fit even though the database is growing increasingly as well have all the symptoms of the big data. Situations in Relational Database Outperforms Adhoc reporting is the one of the most common scenarios where NoSQL is does not have optimal solution. For example reporting queries often needs to aggregate based on the columns which are not indexed as well are built while the report is running, in this kind of scenario NoSQL databases (document database stores, distributed key value stores) database often does not perform well. In the case of the ad-hoc reporting I have often found it is much easier to work with relational databases. SQL is the most popular computer language of all the time. I have been using it for almost over 10 years and I feel that I will be using it for a long time in future. There are plenty of the tools, connectors and awareness of the SQL language in the industry. Pretty much every programming language has a written drivers for the SQL language and most of the developers have learned this language during their school/college time. In many cases, writing query based on SQL is much easier than writing queries in NoSQL supported languages. I believe this is the current situation but in the future this situation can reverse when No SQL query languages are equally popular. ACID (Atomicity Consistency Isolation Durability) – Not all the NoSQL solutions offers ACID compliant language. There are always situations (for example banking transactions, eCommerce shopping carts etc.) where if there is no ACID the operations can be invalid as well database integrity can be at risk. Even though the data volume indeed qualify as a Big Data there are always operations in the application which absolutely needs ACID compliance matured language. The Mixed Bag I have often heard argument that all the big social media sites now a days have moved away from Relational Database. Actually this is not entirely true. While researching about Big Data and Relational Database, I have found that many of the popular social media sites uses Big Data solutions along with Relational Database. Many are using relational databases to deliver the results to end user on the run time and many still uses a relational database as their major backbone. Here are a few examples: Facebook uses MySQL to display the timeline. (Reference Link) Twitter uses MySQL. (Reference Link) Tumblr uses Sharded MySQL (Reference Link) Wikipedia uses MySQL for data storage. (Reference Link) There are many for prominent organizations which are running large scale applications uses relational database along with various Big Data frameworks to satisfy their various business needs. Summary I believe that RDBMS is like a vanilla ice cream. Everybody loves it and everybody has it. NoSQL and other solutions are like chocolate ice cream or custom ice cream – there is a huge base which loves them and wants them but not every ice cream maker can make it just right  for everyone’s taste. No matter how fancy an ice cream store is there is always plain vanilla ice cream available there. Just like the same, there are always cases and situations in the Big Data’s story where traditional relational database is the part of the whole story. In the real world scenarios there will be always the case when there will be need of the relational database concepts and its ideology. It is extremely important to accept relational database as one of the key components of the Big Data instead of treating it as a substandard technology. Ray of Hope – NewSQL In this module we discussed that there are places where we need ACID compliance from our Big Data application and NoSQL will not support that out of box. There is a new termed coined for the application/tool which supports most of the properties of the traditional RDBMS and supports Big Data infrastructure – NewSQL. Tomorrow In tomorrow’s blog post we will discuss about NewSQL. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

  • To sample or not to sample...

    - by [email protected]
    Ideally, we would know the exact answer to every question. How many people support presidential candidate A vs. B? How many people suffer from H1N1 in a given state? Does this batch of manufactured widgets have any defective parts? Knowing exact answers is expensive in terms of time and money and, in most cases, is impractical if not impossible. Consider asking every person in a region for their candidate preference, testing every person with flu symptoms for H1N1 (assuming every person reported when they had flu symptoms), or destructively testing widgets to determine if they are "good" (leaving no product to sell). Knowing exact answers, fortunately, isn't necessary or even useful in many situations. Understanding the direction of a trend or statistically significant results may be sufficient to answer the underlying question: who is likely to win the election, have we likely reached a critical threshold for flu, or is this batch of widgets good enough to ship? Statistics help us to answer these questions with a certain degree of confidence. This focuses on how we collect data. In data mining, we focus on the use of data, that is data that has already been collected. In some cases, we may have all the data (all purchases made by all customers), in others the data may have been collected using sampling (voters, their demographics and candidate choice). Building data mining models on all of your data can be expensive in terms of time and hardware resources. Consider a company with 40 million customers. Do we need to mine all 40 million customers to get useful data mining models? The quality of models built on all data may be no better than models built on a relatively small sample. Determining how much is a reasonable amount of data involves experimentation. When starting the model building process on large datasets, it is often more efficient to begin with a small sample, perhaps 1000 - 10,000 cases (records) depending on the algorithm, source data, and hardware. This allows you to see quickly what issues might arise with choice of algorithm, algorithm settings, data quality, and need for further data preparation. Instead of waiting for a model on a large dataset to build only to find that the results don't meet expectations, once you are satisfied with the results on the initial sample, you can  take a larger sample to see if model quality improves, and to get a sense of how the algorithm scales to the particular dataset. If model accuracy or quality continues to improve, consider increasing the sample size. Sampling in data mining is also used to produce a held-aside or test dataset for assessing classification and regression model accuracy. Here, we reserve some of the build data (data that includes known target values) to be used for an honest estimate of model error using data the model has not seen before. This sampling transformation is often called a split because the build data is split into two randomly selected sets, often with 60% of the records being used for model building and 40% for testing. Sampling must be performed with care, as it can adversely affect model quality and usability. Even a truly random sample doesn't guarantee that all values are represented in a given attribute. This is particularly troublesome when the attribute with omitted values is the target. A predictive model that has not seen any examples for a particular target value can never predict that target value! For other attributes, values may consist of a single value (a constant attribute) or all unique values (an identifier attribute), each of which may be excluded during mining. Values from categorical predictor attributes that didn't appear in the training data are not used when testing or scoring datasets. In subsequent posts, we'll talk about three sampling techniques using Oracle Database: simple random sampling without replacement, stratified sampling, and simple random sampling with replacement.

    Read the article

  • Kipróbálható az ingyenes új Oracle Data Miner 11gR2 grafikus workflow-val

    - by Fekete Zoltán
    Oracle Data Mining technológiai információs oldal. Oracle Data Miner 11g Release 2 - Early Adopter oldal. Megjelent, letöltheto és kipróbálható az Oracle Data Mining, az Oracle adatbányászat új grafikus felülete, az Oracle Data Miner 11gR2. Az Oracle Data Minerhez egyszeruen az SQL Developer-t kell letöltenünk, mivel az adatbányászati felület abból indítható. Az Oracle Data Mining az Oracle adatbáziskezelobe ágyazott adatbányászati motor, ami az Oracle Database Enterprise Edition opciója. Az adatbányászat az adattárházak elemzésének kifinomult eszköze és folyamata. Az Oracle Data Mining in-database-mining elonyeit felvonultatja: - nincs felesleges adatmozgatás, a teljes adatbányászati folyamatban az adatbázisban maradnak az adatok - az adatbányászati modellek is az Oracle adatbázisban vannak - az adatbányászati eredmények, cluster adatok, döntések, valószínuségek, stb. szintén az adatbázisban keletkeznek, és ott közvetlenül elemezhetoek Az új ingyenes Data Miner felület "hatalmas gazdagodáson" ment keresztül az elozo verzióhoz képest. - grafikus adatbányászati workflow szerkesztés és futtatás jelent meg! - továbbra is ingyenes - kibovült a felület - új elemzési lehetoségekkel bovült - az SQL Developer 3.0 felületrol indítható, ez megkönnyíti az adatbányászati funkciók meghívását az adatbázisból, ha épp nem a grafikus felületetet szeretnénk erre használni Az ingyenes Data Miner felület az Oracle SQL Developer kiterjesztéseként érheto el, így az elemzok közvetlenül dolgozhatnak az adatokkal az adatbázisban és a Data Miner grafikus felülettel is, építhetnek és kiértékelhetnek, futtathatnak modelleket, predikciókat tehetnek és elemezhetnek, támogatást kapva az adatbányászati módszertan megvalósítására. A korábbi Oracle Data Miner felület a Data Miner Classic néven fut és továbbra is letöltheto az OTN-rol. Az új Data Miner GUI-ból egy képernyokép: Milyen feladatokra ad megoldási lehetoséget az Oracle Data Mining: - ügyfél viselkedés megjövendölése, prediktálása - a "legjobb" ügyfelek eredményes megcélzása - ügyfél megtartás, elvándorlás kezelés (churn) - ügyfél szegmensek, klaszterek, profilok keresése és vizsgálata - anomáliák, visszaélések felderítése - stb.

    Read the article

  • Rotation of bitmap using a frame by frame animation

    - by pengume
    Hey every one I know this has probably been asked a ton of times but I just wanted to clarify if I am approaching this correctly, since I ran into some problems rotating a bitmap. So basically I have one large bitmap that has four frames drawn on it and I only draw one at a time by looping through the bitmap by increments to animate walking. I can get the bitmap to rotate correctly when it is not moving but once the animation starts it starts to cut off alot of the image and sometimes becomes very fuzzy. I have tried public void draw(Canvas canvas,int pointerX, int pointerY) { Matrix m; if (setRotation){ // canvas.save(); m = new Matrix(); m.reset(); // spriteWidth and spriteHeight are for just the current frame showed m.setTranslate(spriteWidth / 2, spriteHeight / 2); //get and set rotation for ninja based off of joystick m.preRotate((float) GameControls.getRotation()); //create the rotated bitmap flipedSprite = Bitmap.createBitmap(bitmap , 0, 0,bitmap.getWidth(),bitmap.getHeight() , m, true); //set new bitmap to rotated ninja setBitmap(flipedSprite); // canvas.restore(); Log.d("Ninja View", "angle of rotation= " +(float) GameControls.getRotation()); setRotation = false; } And then the Draw Method here // create the destination rectangle for the ninjas current animation frame // pointerX and pointerY are from the joystick moving the ninja around destRect = new Rect(pointerX, pointerY, pointerX + spriteWidth, pointerY + spriteHeight); canvas.drawBitmap(bitmap, getSourceRect(), destRect, null); The animation is four frames long and gets incremented by 66 (the size of one of the frames on the bitmap) for every frame and then back to 0 at the end of the loop.

    Read the article

  • If statement causing xna sprites to draw frame by frame

    - by user1489599
    I’m a bit new to XNA but I wanted to write a simple program that would fire a cannon ball from a cannon at a 45 degree angle. It works fine outside of my keyboard i/o if statement, but when I encapsulate the code around an if statement checking to see if the user hits the space bar, the sprite will draw one frame at a time every time the space bar is hit. This is the code in question if (currentKeyboardState.IsKeyUp(Keys.Space) && previousKeyboardState.IsKeyDown(Keys.Space) && !skullBall.Alive) { //works outside the keyboard input if statement //{ skullBall.Position = cannon.Position; skullBall.DeltaY = -(float)(Math.Sin(MathHelper.ToRadians(45)) * 50/*39.7577*/ * time + 0.5 * (gravity * (time * time))); skullBall.DeltaX = (float)(Math.Cos(MathHelper.ToRadians(45)) * 50/*39.7577*/ * time); skullBall.Alive = true; //} } The skull ball represents the cannon ball and the cannon is just the starting point. DeltaX and DeltaY are the values I’m using to update the cannon balls position per update. I know it's dumb to have the cannon ball start at the cannons position every time the update is called but it’s not really noticeable right now. I was wondering if after examining my code, if anyone noticed any errors that would cause the sprite to display frame by frame instead of drawing it as a full animation of the cannon ball leaving the cannon and moving from there.

    Read the article

  • Big Data – Operational Databases Supporting Big Data – RDBMS and NoSQL – Day 12 of 21

    - by Pinal Dave
    In yesterday’s blog post we learned the importance of the Cloud in the Big Data Story. In this article we will understand the role of Operational Databases Supporting Big Data Story. Even though we keep on talking about Big Data architecture, it is extremely crucial to understand that Big Data system can’t just exist in the isolation of itself. There are many needs of the business can only be fully filled with the help of the operational databases. Just having a system which can analysis big data may not solve every single data problem. Real World Example Think about this way, you are using Facebook and you have just updated your information about the current relationship status. In the next few seconds the same information is also reflected in the timeline of your partner as well as a few of the immediate friends. After a while you will notice that the same information is now also available to your remote friends. Later on when someone searches for all the relationship changes with their friends your change of the relationship will also show up in the same list. Now here is the question – do you think Big Data architecture is doing every single of these changes? Do you think that the immediate reflection of your relationship changes with your family member is also because of the technology used in Big Data. Actually the answer is Facebook uses MySQL to do various updates in the timeline as well as various events we do on their homepage. It is really difficult to part from the operational databases in any real world business. Now we will see a few of the examples of the operational databases. Relational Databases (This blog post) NoSQL Databases (This blog post) Key-Value Pair Databases (Tomorrow’s post) Document Databases (Tomorrow’s post) Columnar Databases (The Day After’s post) Graph Databases (The Day After’s post) Spatial Databases (The Day After’s post) Relational Databases We have earlier discussed about the RDBMS role in the Big Data’s story in detail so we will not cover it extensively over here. Relational Database is pretty much everywhere in most of the businesses which are here for many years. The importance and existence of the relational database are always going to be there as long as there are meaningful structured data around. There are many different kinds of relational databases for example Oracle, SQL Server, MySQL and many others. If you are looking for Open Source and widely accepted database, I suggest to try MySQL as that has been very popular in the last few years. I also suggest you to try out PostgreSQL as well. Besides many other essential qualities PostgreeSQL have very interesting licensing policies. PostgreSQL licenses allow modifications and distribution of the application in open or closed (source) form. One can make any modifications and can keep it private as well as well contribute to the community. I believe this one quality makes it much more interesting to use as well it will play very important role in future. Nonrelational Databases (NOSQL) We have also covered Nonrelational Dabases in earlier blog posts. NoSQL actually stands for Not Only SQL Databases. There are plenty of NoSQL databases out in the market and selecting the right one is always very challenging. Here are few of the properties which are very essential to consider when selecting the right NoSQL database for operational purpose. Data and Query Model Persistence of Data and Design Eventual Consistency Scalability Though above all of the properties are interesting to have in any NoSQL database but the one which most attracts to me is Eventual Consistency. Eventual Consistency RDBMS uses ACID (Atomicity, Consistency, Isolation, Durability) as a key mechanism for ensuring the data consistency, whereas NonRelational DBMS uses BASE for the same purpose. Base stands for Basically Available, Soft state and Eventual consistency. Eventual consistency is widely deployed in distributed systems. It is a consistency model used in distributed computing which expects unexpected often. In large distributed system, there are always various nodes joining and various nodes being removed as they are often using commodity servers. This happens either intentionally or accidentally. Even though one or more nodes are down, it is expected that entire system still functions normally. Applications should be able to do various updates as well as retrieval of the data successfully without any issue. Additionally, this also means that system is expected to return the same updated data anytime from all the functioning nodes. Irrespective of when any node is joining the system, if it is marked to hold some data it should contain the same updated data eventually. As per Wikipedia - Eventual consistency is a consistency model used in distributed computing that informally guarantees that, if no new updates are made to a given data item, eventually all accesses to that item will return the last updated value. In other words -  Informally, if no additional updates are made to a given data item, all reads to that item will eventually return the same value. Tomorrow In tomorrow’s blog post we will discuss about various other Operational Databases supporting Big Data. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

  • timetable in a jTable

    - by chandra
    I want to create a timetable in a jTable. For the top row it will display from monday to sunday and the left colume will display the time of the day with 2h interval e.g 1st colume (0000 - 0200), 2nd colume (0200 - 0400) .... And if i click a button the timing will change from 2h interval to 1h interval. I do not want to hardcode it because i need to do for 2h, 1h, 30min , 15min, 1min, 30sec and 1 sec interval and it will take too long for me to hardcode. Can anyone show me an example or help me create an example for the 2h to 1h interval so that i know what to do? The data array is for me to store data and are there any other easier or shortcuts for me to store them because if it is in 1 sec interval i got thousands of array i need to type it out. private void oneHour() //1 interval functions { if(!once) { initialize(); once = true; } jTable.setModel(new javax.swing.table.DefaultTableModel( new Object [][] { {"0000 - 0100", data[0][0], data[0][1], data[0][2], data[0][3], data[0][4], data[0][5], data[0][6]}, {"0100 - 0200", data[2][0], data[2][1], data[2][2], data[2][3], data[2][4], data[2][5], data[2][6]}, {"0200 - 0300", data[4][0], data[4][1], data[4][2], data[4][3], data[4][4], data[4][5], data[4][6]}, {"0300 - 0400", data[6][0], data[6][1], data[6][2], data[6][3], data[6][4], data[6][5], data[6][6]}, {"0400 - 0600", data[8][0], data[8][1], data[8][2], data[8][3], data[8][4], data[8][5], data[8][6]}, {"0600 - 0700", data[10][0], data[4][1], data[10][2], data[10][3], data[10][4], data[10][5], data[10][6]}, {"0700 - 0800", data[12][0], data[12][1], data[12][2], data[12][3], data[12][4], data[12][5], data[12][6]}, {"0800 - 0900", data[14][0], data[14][1], data[14][2], data[14][3], data[14][4], data[14][5], data[14][6]}, {"0900 - 1000", data[16][0], data[16][1], data[16][2], data[16][3], data[16][4], data[16][5], data[16][6]}, {"1000 - 1100", data[18][0], data[18][1], data[18][2], data[18][3], data[18][4], data[18][5], data[18][6]}, {"1100 - 1200", data[20][0], data[20][1], data[20][2], data[20][3], data[20][4], data[20][5], data[20][6]}, {"1200 - 1300", data[22][0], data[22][1], data[22][2], data[22][3], data[22][4], data[22][5], data[22][6]}, {"1300 - 1400", data[24][0], data[24][1], data[24][2], data[24][3], data[24][4], data[24][5], data[24][6]}, {"1400 - 1500", data[26][0], data[26][1], data[26][2], data[26][3], data[26][4], data[26][5], data[26][6]}, {"1500 - 1600", data[28][0], data[28][1], data[28][2], data[28][3], data[28][4], data[28][5], data[28][6]}, {"1600 - 1700", data[30][0], data[30][1], data[30][2], data[30][3], data[30][4], data[30][5], data[30][6]}, {"1700 - 1800", data[32][0], data[32][1], data[32][2], data[32][3], data[32][4], data[32][5], data[32][6]}, {"1800 - 1900", data[34][0], data[34][1], data[34][2], data[34][3], data[34][4], data[34][5], data[34][6]}, {"1900 - 2000", data[36][0], data[36][1], data[36][2], data[36][3], data[36][4], data[36][5], data[36][6]}, {"2000 - 2100", data[38][0], data[38][1], data[38][2], data[38][3], data[38][4], data[38][5], data[38][6]}, {"2100 - 2200", data[40][0], data[40][1], data[40][2], data[40][3], data[40][4], data[40][5], data[40][6]}, {"2200 - 2300", data[42][0], data[42][1], data[42][2], data[42][3], data[42][4], data[42][5], data[42][6]}, {"2300 - 2400", data[44][0], data[44][1], data[44][2], data[44][3], data[44][4], data[44][5], data[44][6]}, {"2400 - 0000", data[46][0], data[46][1], data[46][2], data[46][3], data[46][4], data[46][5], data[46][6]}, }, new String [] { "Time/Day", "(Mon)", "(Tue)", "(Wed)", "(Thurs)", "(Fri)", "(Sat)", "(Sun)" } )); } private void twoHour() //2 hour interval functions { if(!once) { initialize(); once = true; } jTable.setModel(new javax.swing.table.DefaultTableModel( new Object [][] { {"0000 - 0200", data[0][0], data[0][1], data[0][2], data[0][3], data[0][4], data[0][5], data[0][6]}, {"0200 - 0400", data[4][0], data[4][1], data[4][2], data[4][3], data[4][4], data[4][5], data[4][6]}, {"0400 - 0600", data[8][0], data[8][1], data[8][2], data[8][3], data[8][4], data[8][5], data[8][6]}, {"0600 - 0800", data[12][0], data[12][1], data[12][2], data[12][3], data[12][4], data[12][5], data[12][6]}, {"0800 - 1000", data[16][0], data[16][1], data[16][2], data[16][3], data[16][4], data[16][5], data[16][6]}, {"1000 - 1200", data[20][0], data[20][1], data[20][2], data[20][3], data[20][4], data[20][5], data[20][6]}, {"1200 - 1400", data[24][0], data[24][1], data[24][2], data[24][3], data[24][4], data[24][5], data[24][6]}, {"1400 - 1600", data[28][0], data[28][1], data[28][2], data[28][3], data[28][4], data[28][5], data[28][6]}, {"1600 - 1800", data[32][0], data[32][1], data[32][2], data[32][3], data[32][4], data[32][5], data[32][6]}, {"1800 - 2000", data[36][0], data[36][1], data[36][2], data[36][3], data[36][4], data[36][5], data[36][6]}, {"2000 - 2200", data[40][0], data[40][1], data[40][2], data[40][3], data[40][4], data[40][5], data[40][6]}, {"2200 - 2400",data[44][0], data[44][1], data[44][2], data[44][3], data[44][4], data[44][5], data[44][6]} },

    Read the article

  • ffmpeg - How to determine if -movflags faststart is enabled? PHP

    - by IIIOXIII
    While I am able to encode an mp4 file which I can plan on my local windows machine, I am having trouble encoding files to mp4 which are readable when streaming by safari, etc. After a bit of reading, I believe my issue is that I must move the metadata from the end of the file to the beginning in order for the converted mp4 files to be streamable. To that end, I am trying to find out if the build of ffmpeg that I am currently using is able to use the -movflags faststart option through php - as my current outputted mp4 files are not working when streamed online. This is the way I am now echoing the -help, -formats, -codecs, but I am not seeing anything about -movflags faststart in any of the lists: exec($ffmpegPath." -help", $codecArr); for($ii=0;$ii<count($codecArr);$ii++){ echo $codecArr[$ii].'</br>'; } Is there a similar method of determining if -movflags fastart is available to my ffmpeg build? Any other way? Should it be listed with any of the previously suggested commands? -help/-formats? Can someone that knows it is enabled in their version of ffmpeg check to see if it is listed under -help or -formats, etc.? TIA. EDIT: COMPLETE CONSOLE OUTPUT FOR BOTH THE CONVERSION COMMAND AND -MOVFLAGS COMMAND BELOW: COMMAND: ffmpeg_new -i C:\vidtests\Wildlife.wmv -s 640x480 C:\vidtests\Wildlife.mp4 OUTPUT: ffmpeg version N-54207-ge59fb3f Copyright (c) 2000-2013 the FFmpeg developers built on Jun 25 2013 21:55:00 with gcc 4.7.3 (GCC) configuration: --enable-gpl --enable-version3 --disable-w32threads --enable-av isynth --enable-bzlib --enable-fontconfig --enable-frei0r --enable-gnutls --enab le-iconv --enable-libass --enable-libbluray --enable-libcaca --enable-libfreetyp e --enable-libgsm --enable-libilbc --enable-libmodplug --enable-libmp3lame --ena ble-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-l ibopus --enable-librtmp --enable-libschroedinger --enable-libsoxr --enable-libsp eex --enable-libtheora --enable-libtwolame --enable-libvo-aacenc --enable-libvo- amrwbenc --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libxavs -- enable-libxvid --enable-zlib libavutil 52. 37.101 / 52. 37.101 libavcodec 55. 17.100 / 55. 17.100 libavformat 55. 10.100 / 55. 10.100 libavdevice 55. 2.100 / 55. 2.100 libavfilter 3. 77.101 / 3. 77.101 libswscale 2. 3.100 / 2. 3.100 libswresample 0. 17.102 / 0. 17.102 libpostproc 52. 3.100 / 52. 3.100 [asf @ 00000000002ed760] Stream #0: not enough frames to estimate rate; consider increasing probesize Guessed Channel Layout for Input Stream #0.0 : stereo Input #0, asf, from 'C:\vidtests\Wildlife.wmv' : Metadata: SfOriginalFPS : 299700 WMFSDKVersion : 11.0.6001.7000 WMFSDKNeeded : 0.0.0.0000 comment : Footage: Small World Productions, Inc; Tourism New Zealand | Producer: Gary F. Spradling | Music: Steve Ball title : Wildlife in HD copyright : -¬ 2008 Microsoft Corporation IsVBR : 0 DeviceConformanceTemplate: AP@L3 Duration: 00:00:30.09, start: 0.000000, bitrate: 6977 kb/s Stream #0:0(eng): Audio: wmav2 (a[1][0][0] / 0x0161), 44100 Hz, stereo, fltp , 192 kb/s Stream #0:1(eng): Video: vc1 (Advanced) (WVC1 / 0x31435657), yuv420p, 1280x7 20, 5942 kb/s, 29.97 tbr, 1k tbn, 1k tbc [libx264 @ 00000000002e6980] using cpu capabilities: MMX2 SSE2Fast SSSE3 Cache64 [libx264 @ 00000000002e6980] profile High, level 3.0 [libx264 @ 00000000002e6980] 264 - core 133 r2334 a3ac64b - H.264/MPEG-4 AVC cod ec - Copyleft 2003-2013 - http://www.videolan.org/x264.html - options: cabac=1 r ef=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed _ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pski p=1 chroma_qp_offset=-2 threads=3 lookahead_threads=1 sliced_threads=0 nr=0 deci mate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_ adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=2 5 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.6 0 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00 Output #0, mp4, to 'C:\vidtests\Wildlife.mp4': Metadata: SfOriginalFPS : 299700 WMFSDKVersion : 11.0.6001.7000 WMFSDKNeeded : 0.0.0.0000 comment : Footage: Small World Productions, Inc; Tourism New Zealand | Producer: Gary F. Spradling | Music: Steve Ball title : Wildlife in HD copyright : -¬ 2008 Microsoft Corporation IsVBR : 0 DeviceConformanceTemplate: AP@L3 encoder : Lavf55.10.100 Stream #0:0(eng): Video: h264 (libx264) ([33][0][0][0] / 0x0021), yuv420p, 6 40x480, q=-1--1, 30k tbn, 29.97 tbc Stream #0:1(eng): Audio: aac (libvo_aacenc) ([64][0][0][0] / 0x0040), 44100 Hz, stereo, s16, 128 kb/s Stream mapping: Stream #0:1 -> #0:0 (vc1 -> libx264) Stream #0:0 -> #0:1 (wmav2 -> libvo_aacenc) Press [q] to stop, [?] for help frame= 53 fps= 49 q=29.0 size= 0kB time=00:00:00.13 bitrate= 2.9kbits/ frame= 63 fps= 40 q=29.0 size= 0kB time=00:00:00.46 bitrate= 0.8kbits/ frame= 74 fps= 35 q=29.0 size= 0kB time=00:00:00.83 bitrate= 0.5kbits/ frame= 85 fps= 32 q=29.0 size= 0kB time=00:00:01.20 bitrate= 0.3kbits/ frame= 95 fps= 30 q=29.0 size= 0kB time=00:00:01.53 bitrate= 0.3kbits/ frame= 107 fps= 28 q=29.0 size= 0kB time=00:00:01.93 bitrate= 0.2kbits/ Queue input is backward in time [mp4 @ 00000000003ef800] Non-monotonous DTS in output stream 0:1; previous: 7616 , current: 7063; changing to 7617. This may result in incorrect timestamps in th e output file. frame= 118 fps= 28 q=29.0 size= 113kB time=00:00:02.30 bitrate= 402.6kbits/ frame= 129 fps= 26 q=29.0 size= 219kB time=00:00:02.66 bitrate= 670.7kbits/ frame= 141 fps= 26 q=29.0 size= 264kB time=00:00:03.06 bitrate= 704.2kbits/ frame= 152 fps= 25 q=29.0 size= 328kB time=00:00:03.43 bitrate= 782.2kbits/ frame= 163 fps= 25 q=29.0 size= 431kB time=00:00:03.80 bitrate= 928.1kbits/ frame= 174 fps= 24 q=29.0 size= 568kB time=00:00:04.17 bitrate=1116.3kbits/ frame= 190 fps= 25 q=29.0 size= 781kB time=00:00:04.70 bitrate=1359.9kbits/ frame= 204 fps= 25 q=29.0 size= 1006kB time=00:00:05.17 bitrate=1593.1kbits/ frame= 218 fps= 25 q=29.0 size= 1058kB time=00:00:05.63 bitrate=1536.8kbits/ frame= 229 fps= 25 q=29.0 size= 1093kB time=00:00:06.00 bitrate=1490.9kbits/ frame= 239 fps= 24 q=29.0 size= 1118kB time=00:00:06.33 bitrate=1444.4kbits/ frame= 251 fps= 24 q=29.0 size= 1150kB time=00:00:06.74 bitrate=1397.9kbits/ frame= 265 fps= 24 q=29.0 size= 1234kB time=00:00:07.20 bitrate=1402.3kbits/ frame= 278 fps= 25 q=29.0 size= 1332kB time=00:00:07.64 bitrate=1428.3kbits/ frame= 294 fps= 25 q=29.0 size= 1403kB time=00:00:08.17 bitrate=1405.7kbits/ frame= 308 fps= 25 q=29.0 size= 1547kB time=00:00:08.64 bitrate=1466.4kbits/ frame= 323 fps= 25 q=29.0 size= 1595kB time=00:00:09.14 bitrate=1429.5kbits/ frame= 337 fps= 25 q=29.0 size= 1702kB time=00:00:09.60 bitrate=1450.7kbits/ frame= 351 fps= 25 q=29.0 size= 1755kB time=00:00:10.07 bitrate=1427.1kbits/ frame= 365 fps= 25 q=29.0 size= 1820kB time=00:00:10.54 bitrate=1414.1kbits/ frame= 381 fps= 25 q=29.0 size= 1852kB time=00:00:11.07 bitrate=1369.6kbits/ frame= 396 fps= 26 q=29.0 size= 1893kB time=00:00:11.57 bitrate=1339.5kbits/ frame= 409 fps= 26 q=29.0 size= 1923kB time=00:00:12.01 bitrate=1311.8kbits/ frame= 421 fps= 25 q=29.0 size= 1967kB time=00:00:12.41 bitrate=1298.3kbits/ frame= 434 fps= 25 q=29.0 size= 1998kB time=00:00:12.84 bitrate=1274.0kbits/ frame= 445 fps= 25 q=29.0 size= 2018kB time=00:00:13.21 bitrate=1251.3kbits/ frame= 458 fps= 25 q=29.0 size= 2048kB time=00:00:13.64 bitrate=1229.5kbits/ frame= 471 fps= 25 q=29.0 size= 2067kB time=00:00:14.08 bitrate=1202.3kbits/ frame= 484 fps= 25 q=29.0 size= 2189kB time=00:00:14.51 bitrate=1235.5kbits/ frame= 497 fps= 25 q=29.0 size= 2260kB time=00:00:14.94 bitrate=1238.3kbits/ frame= 509 fps= 25 q=29.0 size= 2311kB time=00:00:15.34 bitrate=1233.3kbits/ frame= 523 fps= 25 q=29.0 size= 2429kB time=00:00:15.81 bitrate=1258.1kbits/ frame= 535 fps= 25 q=29.0 size= 2541kB time=00:00:16.21 bitrate=1283.5kbits/ frame= 548 fps= 25 q=29.0 size= 2718kB time=00:00:16.64 bitrate=1337.5kbits/ frame= 560 fps= 25 q=29.0 size= 2845kB time=00:00:17.05 bitrate=1367.1kbits/ frame= 571 fps= 25 q=29.0 size= 2965kB time=00:00:17.41 bitrate=1394.6kbits/ frame= 580 fps= 25 q=29.0 size= 3025kB time=00:00:17.71 bitrate=1398.7kbits/ frame= 588 fps= 25 q=29.0 size= 3098kB time=00:00:17.98 bitrate=1411.1kbits/ frame= 597 fps= 25 q=29.0 size= 3183kB time=00:00:18.28 bitrate=1426.1kbits/ frame= 606 fps= 24 q=29.0 size= 3279kB time=00:00:18.58 bitrate=1445.2kbits/ frame= 616 fps= 24 q=29.0 size= 3441kB time=00:00:18.91 bitrate=1489.9kbits/ frame= 626 fps= 24 q=29.0 size= 3650kB time=00:00:19.25 bitrate=1553.0kbits/ frame= 638 fps= 24 q=29.0 size= 3826kB time=00:00:19.65 bitrate=1594.7kbits/ frame= 649 fps= 24 q=29.0 size= 3950kB time=00:00:20.02 bitrate=1616.3kbits/ frame= 660 fps= 24 q=29.0 size= 4067kB time=00:00:20.38 bitrate=1634.1kbits/ frame= 669 fps= 24 q=29.0 size= 4121kB time=00:00:20.68 bitrate=1631.8kbits/ frame= 682 fps= 24 q=29.0 size= 4274kB time=00:00:21.12 bitrate=1657.9kbits/ frame= 696 fps= 24 q=29.0 size= 4446kB time=00:00:21.58 bitrate=1687.1kbits/ frame= 709 fps= 24 q=29.0 size= 4590kB time=00:00:22.02 bitrate=1707.3kbits/ frame= 719 fps= 24 q=29.0 size= 4772kB time=00:00:22.35 bitrate=1748.5kbits/ frame= 732 fps= 24 q=29.0 size= 4852kB time=00:00:22.78 bitrate=1744.3kbits/ frame= 744 fps= 24 q=29.0 size= 4973kB time=00:00:23.18 bitrate=1756.9kbits/ frame= 756 fps= 24 q=29.0 size= 5099kB time=00:00:23.59 bitrate=1770.8kbits/ frame= 768 fps= 24 q=29.0 size= 5149kB time=00:00:23.99 bitrate=1758.4kbits/ frame= 780 fps= 24 q=29.0 size= 5227kB time=00:00:24.39 bitrate=1755.7kbits/ frame= 797 fps= 24 q=29.0 size= 5377kB time=00:00:24.95 bitrate=1765.0kbits/ frame= 813 fps= 24 q=29.0 size= 5507kB time=00:00:25.49 bitrate=1769.5kbits/ frame= 828 fps= 24 q=29.0 size= 5634kB time=00:00:25.99 bitrate=1775.5kbits/ frame= 843 fps= 24 q=29.0 size= 5701kB time=00:00:26.49 bitrate=1762.9kbits/ frame= 859 fps= 24 q=29.0 size= 5830kB time=00:00:27.02 bitrate=1767.0kbits/ frame= 872 fps= 24 q=29.0 size= 5926kB time=00:00:27.46 bitrate=1767.7kbits/ frame= 888 fps= 24 q=29.0 size= 6014kB time=00:00:27.99 bitrate=1759.7kbits/ frame= 900 fps= 24 q=29.0 size= 6332kB time=00:00:28.39 bitrate=1826.9kbits/ frame= 901 fps= 24 q=-1.0 Lsize= 6717kB time=00:00:30.10 bitrate=1828.0kbits /s video:6211kB audio:472kB subtitle:0 global headers:0kB muxing overhead 0.513217% [libx264 @ 00000000002e6980] frame I:8 Avg QP:21.77 size: 39744 [libx264 @ 00000000002e6980] frame P:433 Avg QP:25.69 size: 11490 [libx264 @ 00000000002e6980] frame B:460 Avg QP:29.25 size: 2319 [libx264 @ 00000000002e6980] consecutive B-frames: 5.4% 78.6% 2.7% 13.3% [libx264 @ 00000000002e6980] mb I I16..4: 21.8% 48.8% 29.5% [libx264 @ 00000000002e6980] mb P I16..4: 0.7% 4.0% 1.3% P16..4: 37.1% 22.2 % 15.5% 0.0% 0.0% skip:19.2% [libx264 @ 00000000002e6980] mb B I16..4: 0.1% 0.5% 0.2% B16..8: 43.5% 7.0 % 2.1% direct: 2.2% skip:44.5% L0:36.4% L1:52.7% BI:10.9% [libx264 @ 00000000002e6980] 8x8 transform intra:62.8% inter:56.2% [libx264 @ 00000000002e6980] coded y,uvDC,uvAC intra: 74.2% 78.8% 44.0% inter: 2 3.6% 14.5% 1.0% [libx264 @ 00000000002e6980] i16 v,h,dc,p: 48% 24% 9% 20% [libx264 @ 00000000002e6980] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 16% 17% 15% 7% 8% 11% 8% 10% 8% [libx264 @ 00000000002e6980] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 19% 17% 15% 7% 10% 11% 8% 7% 7% [libx264 @ 00000000002e6980] i8c dc,h,v,p: 53% 21% 18% 7% [libx264 @ 00000000002e6980] Weighted P-Frames: Y:0.7% UV:0.0% [libx264 @ 00000000002e6980] ref P L0: 62.4% 19.0% 12.0% 6.6% 0.0% [libx264 @ 00000000002e6980] ref B L0: 90.5% 8.9% 0.7% [libx264 @ 00000000002e6980] ref B L1: 97.9% 2.1% [libx264 @ 00000000002e6980] kb/s:1692.37 AND THE –MOVFLAGS COMMAND: C:\XSITE\SITE>ffmpeg_new -i C:\vidtests\Wildlife.mp4 -movflags faststart C:\vidtests\Wildlife_fs.mp4 AND THE –MOVFLAGS OUTPUT ffmpeg version N-54207-ge59fb3f Copyright (c) 2000-2013 the FFmpeg developers built on Jun 25 2013 21:55:00 with gcc 4.7.3 (GCC) configuration: --enable-gpl --enable-version3 --disable-w32threads --enable-av isynth --enable-bzlib --enable-fontconfig --enable-frei0r --enable-gnutls --enab le-iconv --enable-libass --enable-libbluray --enable-libcaca --enable-libfreetyp e --enable-libgsm --enable-libilbc --enable-libmodplug --enable-libmp3lame --ena ble-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-l ibopus --enable-librtmp --enable-libschroedinger --enable-libsoxr --enable-libsp eex --enable-libtheora --enable-libtwolame --enable-libvo-aacenc --enable-libvo- amrwbenc --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libxavs -- enable-libxvid --enable-zlib libavutil 52. 37.101 / 52. 37.101 libavcodec 55. 17.100 / 55. 17.100 libavformat 55. 10.100 / 55. 10.100 libavdevice 55. 2.100 / 55. 2.100 libavfilter 3. 77.101 / 3. 77.101 libswscale 2. 3.100 / 2. 3.100 libswresample 0. 17.102 / 0. 17.102 libpostproc 52. 3.100 / 52. 3.100 Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'C:\vidtests\Wildlife.mp4': Metadata: major_brand : isom minor_version : 512 compatible_brands: isomiso2avc1mp41 title : Wildlife in HD encoder : Lavf55.10.100 comment : Footage: Small World Productions, Inc; Tourism New Zealand | Producer: Gary F. Spradling | Music: Steve Ball copyright : -¬ 2008 Microsoft Corporation Duration: 00:00:30.13, start: 0.036281, bitrate: 1826 kb/s Stream #0:0(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 640x480, 1692 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc Metadata: handler_name : VideoHandler Stream #0:1(eng): Audio: aac (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 12 8 kb/s Metadata: handler_name : SoundHandler [libx264 @ 0000000004360620] using cpu capabilities: MMX2 SSE2Fast SSSE3 Cache64 [libx264 @ 0000000004360620] profile High, level 3.0 [libx264 @ 0000000004360620] 264 - core 133 r2334 a3ac64b - H.264/MPEG-4 AVC cod ec - Copyleft 2003-2013 - http://www.videolan.org/x264.html - options: cabac=1 r ef=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed _ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pski p=1 chroma_qp_offset=-2 threads=3 lookahead_threads=1 sliced_threads=0 nr=0 deci mate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_ adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=2 5 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.6 0 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00 Output #0, mp4, to 'C:\vidtests\Wildlife_fs.mp4': Metadata: major_brand : isom minor_version : 512 compatible_brands: isomiso2avc1mp41 title : Wildlife in HD copyright : -¬ 2008 Microsoft Corporation comment : Footage: Small World Productions, Inc; Tourism New Zealand | Producer: Gary F. Spradling | Music: Steve Ball encoder : Lavf55.10.100 Stream #0:0(eng): Video: h264 (libx264) ([33][0][0][0] / 0x0021), yuv420p, 6 40x480, q=-1--1, 30k tbn, 29.97 tbc Metadata: handler_name : VideoHandler Stream #0:1(eng): Audio: aac (libvo_aacenc) ([64][0][0][0] / 0x0040), 44100 Hz, stereo, s16, 128 kb/s Metadata: handler_name : SoundHandler Stream mapping: Stream #0:0 -> #0:0 (h264 -> libx264) Stream #0:1 -> #0:1 (aac -> libvo_aacenc) Press [q] to stop, [?] for help frame= 52 fps=0.0 q=29.0 size= 29kB time=00:00:01.76 bitrate= 133.9kbits/ frame= 63 fps= 60 q=29.0 size= 104kB time=00:00:02.14 bitrate= 397.2kbits/ frame= 74 fps= 47 q=29.0 size= 176kB time=00:00:02.51 bitrate= 573.2kbits/ frame= 87 fps= 41 q=29.0 size= 265kB time=00:00:02.93 bitrate= 741.2kbits/ frame= 101 fps= 37 q=29.0 size= 358kB time=00:00:03.39 bitrate= 862.8kbits/ frame= 113 fps= 34 q=29.0 size= 437kB time=00:00:03.79 bitrate= 943.7kbits/ frame= 125 fps= 33 q=29.0 size= 520kB time=00:00:04.20 bitrate=1012.2kbits/ frame= 138 fps= 32 q=29.0 size= 606kB time=00:00:04.64 bitrate=1069.8kbits/ frame= 151 fps= 31 q=29.0 size= 696kB time=00:00:05.06 bitrate=1124.3kbits/ frame= 163 fps= 30 q=29.0 size= 780kB time=00:00:05.47 bitrate=1166.4kbits/ frame= 176 fps= 30 q=29.0 size= 919kB time=00:00:05.90 bitrate=1273.9kbits/ frame= 196 fps= 31 q=29.0 size= 994kB time=00:00:06.57 bitrate=1237.4kbits/ frame= 213 fps= 31 q=29.0 size= 1097kB time=00:00:07.13 bitrate=1258.8kbits/ frame= 225 fps= 30 q=29.0 size= 1204kB time=00:00:07.53 bitrate=1309.8kbits/ frame= 236 fps= 30 q=29.0 size= 1323kB time=00:00:07.91 bitrate=1369.4kbits/ frame= 249 fps= 29 q=29.0 size= 1451kB time=00:00:08.34 bitrate=1424.6kbits/ frame= 263 fps= 29 q=29.0 size= 1574kB time=00:00:08.82 bitrate=1461.3kbits/ frame= 278 fps= 29 q=29.0 size= 1610kB time=00:00:09.30 bitrate=1416.9kbits/ frame= 296 fps= 30 q=29.0 size= 1655kB time=00:00:09.91 bitrate=1368.0kbits/ frame= 313 fps= 30 q=29.0 size= 1697kB time=00:00:10.48 bitrate=1326.4kbits/ frame= 330 fps= 30 q=29.0 size= 1737kB time=00:00:11.05 bitrate=1286.5kbits/ frame= 345 fps= 30 q=29.0 size= 1776kB time=00:00:11.54 bitrate=1260.4kbits/ frame= 361 fps= 30 q=29.0 size= 1813kB time=00:00:12.07 bitrate=1230.3kbits/ frame= 377 fps= 30 q=29.0 size= 1847kB time=00:00:12.59 bitrate=1201.4kbits/ frame= 395 fps= 30 q=29.0 size= 1880kB time=00:00:13.22 bitrate=1165.0kbits/ frame= 410 fps= 30 q=29.0 size= 1993kB time=00:00:13.72 bitrate=1190.2kbits/ frame= 424 fps= 30 q=29.0 size= 2080kB time=00:00:14.18 bitrate=1201.4kbits/ frame= 439 fps= 30 q=29.0 size= 2166kB time=00:00:14.67 bitrate=1209.4kbits/ frame= 455 fps= 30 q=29.0 size= 2262kB time=00:00:15.21 bitrate=1217.5kbits/ frame= 469 fps= 30 q=29.0 size= 2341kB time=00:00:15.68 bitrate=1223.0kbits/ frame= 484 fps= 30 q=29.0 size= 2430kB time=00:00:16.19 bitrate=1229.1kbits/ frame= 500 fps= 30 q=29.0 size= 2523kB time=00:00:16.71 bitrate=1236.3kbits/ frame= 515 fps= 30 q=29.0 size= 2607kB time=00:00:17.21 bitrate=1240.4kbits/ frame= 531 fps= 30 q=29.0 size= 2681kB time=00:00:17.73 bitrate=1238.2kbits/ frame= 546 fps= 30 q=29.0 size= 2758kB time=00:00:18.24 bitrate=1238.2kbits/ frame= 561 fps= 30 q=29.0 size= 2824kB time=00:00:18.75 bitrate=1233.4kbits/ frame= 576 fps= 30 q=29.0 size= 2955kB time=00:00:19.25 bitrate=1256.8kbits/ frame= 586 fps= 29 q=29.0 size= 3061kB time=00:00:19.59 bitrate=1279.6kbits/ frame= 598 fps= 29 q=29.0 size= 3217kB time=00:00:19.99 bitrate=1318.4kbits/ frame= 610 fps= 29 q=29.0 size= 3354kB time=00:00:20.39 bitrate=1347.2kbits/ frame= 622 fps= 29 q=29.0 size= 3483kB time=00:00:20.78 bitrate=1372.6kbits/ frame= 634 fps= 29 q=29.0 size= 3593kB time=00:00:21.19 bitrate=1388.6kbits/ frame= 648 fps= 29 q=29.0 size= 3708kB time=00:00:21.66 bitrate=1402.3kbits/ frame= 661 fps= 29 q=29.0 size= 3811kB time=00:00:22.08 bitrate=1413.5kbits/ frame= 674 fps= 29 q=29.0 size= 3978kB time=00:00:22.53 bitrate=1446.3kbits/ frame= 690 fps= 29 q=29.0 size= 4133kB time=00:00:23.05 bitrate=1468.4kbits/ frame= 706 fps= 29 q=29.0 size= 4263kB time=00:00:23.58 bitrate=1480.4kbits/ frame= 721 fps= 29 q=29.0 size= 4391kB time=00:00:24.08 bitrate=1493.8kbits/ frame= 735 fps= 29 q=29.0 size= 4524kB time=00:00:24.55 bitrate=1509.4kbits/ frame= 748 fps= 29 q=29.0 size= 4661kB time=00:00:24.98 bitrate=1528.2kbits/ frame= 763 fps= 29 q=29.0 size= 4835kB time=00:00:25.50 bitrate=1553.1kbits/ frame= 778 fps= 29 q=29.0 size= 4993kB time=00:00:25.99 bitrate=1573.6kbits/ frame= 795 fps= 29 q=29.0 size= 5149kB time=00:00:26.56 bitrate=1588.1kbits/ frame= 814 fps= 29 q=29.0 size= 5258kB time=00:00:27.18 bitrate=1584.4kbits/ frame= 833 fps= 29 q=29.0 size= 5368kB time=00:00:27.82 bitrate=1580.2kbits/ frame= 851 fps= 29 q=29.0 size= 5469kB time=00:00:28.43 bitrate=1575.9kbits/ frame= 870 fps= 29 q=29.0 size= 5567kB time=00:00:29.05 bitrate=1569.5kbits/ frame= 889 fps= 29 q=29.0 size= 5688kB time=00:00:29.70 bitrate=1568.4kbits/ Starting second pass: moving header on top of the file frame= 902 fps= 28 q=-1.0 Lsize= 6109kB time=00:00:30.14 bitrate=1659.8kbits /s dup=1 drop=0 video:5602kB audio:472kB subtitle:0 global headers:0kB muxing overhead 0.566600% [libx264 @ 0000000004360620] frame I:8 Avg QP:20.52 size: 39667 [libx264 @ 0000000004360620] frame P:419 Avg QP:25.06 size: 10524 [libx264 @ 0000000004360620] frame B:475 Avg QP:29.03 size: 2123 [libx264 @ 0000000004360620] consecutive B-frames: 3.2% 79.6% 0.3% 16.9% [libx264 @ 0000000004360620] mb I I16..4: 20.7% 52.3% 26.9% [libx264 @ 0000000004360620] mb P I16..4: 0.7% 4.2% 1.1% P16..4: 39.4% 21.4 % 13.8% 0.0% 0.0% skip:19.3% [libx264 @ 0000000004360620] mb B I16..4: 0.1% 0.9% 0.3% B16..8: 41.8% 6.4 % 1.7% direct: 1.7% skip:47.1% L0:36.4% L1:53.3% BI:10.3% [libx264 @ 0000000004360620] 8x8 transform intra:65.7% inter:58.8% [libx264 @ 0000000004360620] coded y,uvDC,uvAC intra: 71.2% 76.6% 35.7% inter: 2 0.7% 13.0% 0.5% [libx264 @ 0000000004360620] i16 v,h,dc,p: 48% 24% 8% 20% [libx264 @ 0000000004360620] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 17% 18% 15% 6% 8% 11% 8% 10% 8% [libx264 @ 0000000004360620] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 19% 16% 15% 7% 10% 11% 8% 8% 7% [libx264 @ 0000000004360620] i8c dc,h,v,p: 51% 22% 19% 9% [libx264 @ 0000000004360620] Weighted P-Frames: Y:0.7% UV:0.0% [libx264 @ 0000000004360620] ref P L0: 63.4% 19.7% 11.0% 5.9% 0.0% [libx264 @ 0000000004360620] ref B L0: 90.7% 8.7% 0.7% [libx264 @ 0000000004360620] ref B L1: 98.4% 1.6% [libx264 @ 0000000004360620] kb/s:1524.54

    Read the article

  • Oracle Data Mining a Star Schema: Telco Churn Case Study

    - by charlie.berger
    There is a complete and detailed Telco Churn case study "How to" Blog Series just posted by Ari Mozes, ODM Dev. Manager.  In it, Ari provides detailed guidance in how to leverage various strengths of Oracle Data Mining including the ability to: mine Star Schemas and join tables and views together to obtain a complete 360 degree view of a customer combine transactional data e.g. call record detail (CDR) data, etc. define complex data transformation, model build and model deploy analytical methodologies inside the Database  His blog is posted in a multi-part series.  Below are some opening excerpts for the first 3 blog entries.  This is an excellent resource for any novice to skilled data miner who wants to gain competitive advantage by mining their data inside the Oracle Database.  Many thanks Ari! Mining a Star Schema: Telco Churn Case Study (1 of 3) One of the strengths of Oracle Data Mining is the ability to mine star schemas with minimal effort.  Star schemas are commonly used in relational databases, and they often contain rich data with interesting patterns.  While dimension tables may contain interesting demographics, fact tables will often contain user behavior, such as phone usage or purchase patterns.  Both of these aspects - demographics and usage patterns - can provide insight into behavior.Churn is a critical problem in the telecommunications industry, and companies go to great lengths to reduce the churn of their customer base.  One case study1 describes a telecommunications scenario involving understanding, and identification of, churn, where the underlying data is present in a star schema.  That case study is a good example for demonstrating just how natural it is for Oracle Data Mining to analyze a star schema, so it will be used as the basis for this series of posts...... Mining a Star Schema: Telco Churn Case Study (2 of 3) This post will follow the transformation steps as described in the case study, but will use Oracle SQL as the means for preparing data.  Please see the previous post for background material, including links to the case study and to scripts that can be used to replicate the stages in these posts.1) Handling missing values for call data recordsThe CDR_T table records the number of phone minutes used by a customer per month and per call type (tariff).  For example, the table may contain one record corresponding to the number of peak (call type) minutes in January for a specific customer, and another record associated with international calls in March for the same customer.  This table is likely to be fairly dense (most type-month combinations for a given customer will be present) due to the coarse level of aggregation, but there may be some missing values.  Missing entries may occur for a number of reasons: the customer made no calls of a particular type in a particular month, the customer switched providers during the timeframe, or perhaps there is a data entry problem.  In the first situation, the correct interpretation of a missing entry would be to assume that the number of minutes for the type-month combination is zero.  In the other situations, it is not appropriate to assume zero, but rather derive some representative value to replace the missing entries.  The referenced case study takes the latter approach.  The data is segmented by customer and call type, and within a given customer-call type combination, an average number of minutes is computed and used as a replacement value.In SQL, we need to generate additional rows for the missing entries and populate those rows with appropriate values.  To generate the missing rows, Oracle's partition outer join feature is a perfect fit.  select cust_id, cdre.tariff, cdre.month, minsfrom cdr_t cdr partition by (cust_id) right outer join     (select distinct tariff, month from cdr_t) cdre     on (cdr.month = cdre.month and cdr.tariff = cdre.tariff);   ....... Mining a Star Schema: Telco Churn Case Study (3 of 3) Now that the "difficult" work is complete - preparing the data - we can move to building a predictive model to help identify and understand churn.The case study suggests that separate models be built for different customer segments (high, medium, low, and very low value customer groups).  To reduce the data to a single segment, a filter can be applied: create or replace view churn_data_high asselect * from churn_prep where value_band = 'HIGH'; It is simple to take a quick look at the predictive aspects of the data on a univariate basis.  While this does not capture the more complex multi-variate effects as would occur with the full-blown data mining algorithms, it can give a quick feel as to the predictive aspects of the data as well as validate the data preparation steps.  Oracle Data Mining includes a predictive analytics package which enables quick analysis. begin  dbms_predictive_analytics.explain(   'churn_data_high','churn_m6','expl_churn_tab'); end; /select * from expl_churn_tab where rank <= 5 order by rank; ATTRIBUTE_NAME       ATTRIBUTE_SUBNAME EXPLANATORY_VALUE RANK-------------------- ----------------- ----------------- ----------LOS_BAND                                      .069167052          1MINS_PER_TARIFF_MON  PEAK-5                   .034881648          2REV_PER_MON          REV-5                    .034527798          3DROPPED_CALLS                                 .028110322          4MINS_PER_TARIFF_MON  PEAK-4                   .024698149          5From the above results, it is clear that some predictors do contain information to help identify churn (explanatory value > 0).  The strongest uni-variate predictor of churn appears to be the customer's (binned) length of service.  The second strongest churn indicator appears to be the number of peak minutes used in the most recent month.  The subname column contains the interior piece of the DM_NESTED_NUMERICALS column described in the previous post.  By using the object relational approach, many related predictors are included within a single top-level column. .....   NOTE:  These are just EXCERPTS.  Click here to start reading the Oracle Data Mining a Star Schema: Telco Churn Case Study from the beginning.    

    Read the article

  • First frame has a much longer delta time than other frames

    - by Kipras
    I had a problem where my AI moved extreme at the first frame and then normal after that. I then figured out it was my delta. It's about 0.016 seconds (60 fps), but the first frame was about 19000 seconds, which is obviously impossible. Does anybody know what might be happening? Also the delta later on likes to oscillate from 0.01 to 0.03, which is, again, crazy. long time = Sys.getTime() * 1000 / Sys.getTimerResolution(); float delta = (time - lastFrame) / 1000f; lastFrame = time; return delta; That's the delta code.

    Read the article

  • SQL Rally Pre-Con: Data Warehouse Modeling – Making the Right Choices

    - by Davide Mauri
    As you may have already learned from my old post or Adam’s or Kalen’s posts, there will be two SQL Rally in North Europe. In the Stockholm SQL Rally, with my friend Thomas Kejser, I’ll be delivering a pre-con on Data Warehouse Modeling: Data warehouses play a central role in any BI solution. It's the back end upon which everything in years to come will be created. For this reason, it must be rock solid and yet flexible at the same time. To develop such a data warehouse, you must have a clear idea of its architecture, a thorough understanding of the concepts of Measures and Dimensions, and a proven engineered way to build it so that quality and stability can go hand-in-hand with cost reduction and scalability. In this workshop, Thomas Kejser and Davide Mauri will share all the information they learned since they started working with data warehouses, giving you the guidance and tips you need to start your BI project in the best way possible?avoiding errors, making implementation effective and efficient, paving the way for a winning Agile approach, and helping you define how your team should work so that your BI solution will stand the test of time. You'll learn: Data warehouse architecture and justification Agile methodology Dimensional modeling, including Kimball vs. Inmon, SCD1/SCD2/SCD3, Junk and Degenerate Dimensions, and Huge Dimensions Best practices, naming conventions, and lessons learned Loading the data warehouse, including loading Dimensions, loading Facts (Full Load, Incremental Load, Partitioned Load) Data warehouses and Big Data (Hadoop) Unit testing Tracking historical changes and managing large sizes With all the Self-Service BI hype, Data Warehouse is become more and more central every day, since if everyone will be able to analyze data using self-service tools, it’s better for him/her to rely on correct, uniform and coherent data. Already 50 people registered from the workshop and seats are limited so don’t miss this unique opportunity to attend to this workshop that is really a unique combination of years and years of experience! http://www.sqlpass.org/sqlrally/2013/nordic/Agenda/PreconferenceSeminars.aspx See you there!

    Read the article

  • Oracle Financial Analytics for SAP Certified with Oracle Data Integrator EE

    - by denis.gray
    Two days ago Oracle announced the release of Oracle Financial Analytics for SAP.  With the amount of press this has garnered in the past two days, there's a key detail that can't be missed.  This release is certified with Oracle Data Integrator EE - now making the combination of Data Integration and Business Intelligence a force to contend with.  Within the Oracle Press Release there were two important bullets: ·         Oracle Financial Analytics for SAP includes a pre-packaged ABAP code compliant adapter and is certified with Oracle Data Integrator Enterprise Edition to integrate SAP Financial Accounting data directly with the analytic application.  ·         Helping to integrate SAP financial data and disparate third-party data sources is Oracle Data Integrator Enterprise Edition which delivers fast, efficient loading and transformation of timely data into a data warehouse environment through its high-performance Extract Load and Transform (E-LT) technology. This is very exciting news, demonstrating Oracle's overall commitment to Oracle Data Integrator EE.   This is a great way to start off the new year and we look forward to building on this momentum throughout 2011.   The following links contain additional information and media responses about the Oracle Financial Analytics for SAP release. IDG News Service (Also appeared in PC World, Computer World, CIO: "Oracle is moving further into rival SAP's turf with Oracle Financial Analytics for SAP, a new BI (business intelligence) application that can crunch ERP (enterprise resource planning) system financial data for insights." Information Week: "Oracle talks a good game about the appeal of an optimized, all-Oracle stack. But the company also recognizes that we live in a predominantly heterogeneous IT world" CRN: "While some businesses with SAP Financial Accounting already use Oracle BI, those integrations had to be custom developed. The new offering provides pre-built integration capabilities." ECRM Guide:  "Among other features, Oracle Financial Analytics for SAP helps front-line managers improve financial performance and decision-making with what the company says is comprehensive, timely and role-based information on their departments' expenses and revenue contributions."   SAP Getting Started Guide for ODI on OTN: http://www.oracle.com/technetwork/middleware/data-integrator/learnmore/index.html For more information on the ODI and its SAP connectivity please review the Oracle® Fusion Middleware Application Adapters Guide for Oracle Data Integrator11g Release 1 (11.1.1)

    Read the article

  • What is the definition of "Big Data"?

    - by Ben
    Is there one? All the definitions I can find describe the size, complexity / variety or velocity of the data. Wikipedia's definition is the only one I've found with an actual number Big data sizes are a constantly moving target, as of 2012 ranging from a few dozen terabytes to many petabytes of data in a single data set. However, this seemingly contradicts the MIKE2.0 definition, referenced in the next paragraph, which indicates that "big" data can be small and that 100,000 sensors on an aircraft creating only 3GB of data could be considered big. IBM despite saying that: Big data is more simply than a matter of size. have emphasised size in their definition. O'Reilly has stressed "volume, velocity and variety" as well. Though explained well, and in more depth, the definition seems to be a re-hash of the others - or vice-versa of course. I think that a Computer Weekly article title sums up a number of articles fairly well "What is big data and how can it be used to gain competitive advantage". But ZDNet wins with the following from 2012: “Big Data” is a catch phrase that has been bubbling up from the high performance computing niche of the IT market... If one sits through the presentations from ten suppliers of technology, fifteen or so different definitions are likely to come forward. Each definition, of course, tends to support the need for that supplier’s products and services. Imagine that. Basically "big data" is "big" in some way shape or form. What is "big"? Is it quantifiable at the current time? If "big" is unquantifiable is there a definition that does not rely solely on generalities?

    Read the article

  • How to check if a Webcam is broken?

    - by user84812
    I just bought an Acer Aspire 3830TG, it comes with an integrated 1.3M HD Webcam. Before buying it i tried with a bootable Lubuntu usb stick, everything worked well except for the webcam, which i thought I had to tweak. The thing is that it seems the camera should work with no problems in ubuntu. The driver is detected, I try dmesg | grep uvcvideo and the output is [ 12.226174] uvcvideo: Found UVC 1.00 device 1.3M HD WebCam (058f:b002) [ 12.245553] usbcore: registered new interface driver uvcvideo I've also tried using different software (guvcview is black when camera output is MJPG and turns to funny colors when YU12 or YV12, cheese is always black, camorama is always with funny colors...). I should have checked that it was working properly with the default os (windows) but now it's too late for that. I even booted with a official Ubuntu Quantal distro from the usb pen, and the results are the same. so, my question is: is there any way to check that the camera is righmt or broken? So, if it's broken, at least i can go to the shop, show them that it's really broken and get an external webcam for free, or st like that. cheers. UPDATE 1 Thanks, jrp. I run sudo lsinput, and the output info about my video is the following: /dev/input/event6 bustype : BUS_USB vendor : 0x58f product : 0xb002 version : 2 name : "1.3M HD WebCam" phys : "usb-0000:00:1a.0-1.3/button" bits ev : EV_SYN EV_KEY /dev/input/event7 bustype : BUS_HOST vendor : 0x0 product : 0x6 version : 0 name : "Video Bus" phys : "LNXVIDEO/video/input0" bits ev : EV_SYN EV_KEY With this info, i'm not pretty sure about running the luvcview command. If I run luvcview -d /dev/video0 -L, the output is the following: SDL information: Video driver: x11 A window manager is available Device information: Device path: /dev/video0 { pixelformat = 'YUYV', description = 'YUV 4:2:2 (YUYV)' } { discrete: width = 640, height = 480 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 160, height = 120 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 176, height = 144 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 320, height = 240 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 352, height = 288 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 1280, height = 720 } Time interval between frame: 1/7, 1/5, { discrete: width = 1280, height = 800 } Time interval between frame: 1/7, 1/5, { discrete: width = 1280, height = 960 } Time interval between frame: 1/7, 1/5, { discrete: width = 1280, height = 1024 } Time interval between frame: 1/7, 1/5, { pixelformat = 'MJPG', description = 'MJPEG' } { discrete: width = 640, height = 480 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 160, height = 120 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 176, height = 144 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 320, height = 240 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 352, height = 288 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 1280, height = 720 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 1280, height = 800 } Time interval between frame: 1/15, 1/10, 1/5, { discrete: width = 1280, height = 960 } Time interval between frame: 1/15, 1/10, 1/5, { discrete: width = 1280, height = 1024 } Time interval between frame: 1/15, 1/10, 1/5, { pixelformat = 'RGB3', description = 'RGB3' } { discrete: width = 640, height = 480 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 160, height = 120 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 176, height = 144 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 320, height = 240 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 352, height = 288 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 1280, height = 720 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 1280, height = 800 } Time interval between frame: 1/15, 1/10, 1/5, { discrete: width = 1280, height = 960 } Time interval between frame: 1/15, 1/10, 1/5, { discrete: width = 1280, height = 1024 } Time interval between frame: 1/15, 1/10, 1/5, { pixelformat = 'BGR3', description = 'BGR3' } { discrete: width = 640, height = 480 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 160, height = 120 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 176, height = 144 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 320, height = 240 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 352, height = 288 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 1280, height = 720 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 1280, height = 800 } Time interval between frame: 1/15, 1/10, 1/5, { discrete: width = 1280, height = 960 } Time interval between frame: 1/15, 1/10, 1/5, { discrete: width = 1280, height = 1024 } Time interval between frame: 1/15, 1/10, 1/5, { pixelformat = 'YU12', description = 'YU12' } { discrete: width = 640, height = 480 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 160, height = 120 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 176, height = 144 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 320, height = 240 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 352, height = 288 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 1280, height = 720 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 1280, height = 800 } Time interval between frame: 1/15, 1/10, 1/5, { discrete: width = 1280, height = 960 } Time interval between frame: 1/15, 1/10, 1/5, { discrete: width = 1280, height = 1024 } Time interval between frame: 1/15, 1/10, 1/5, { pixelformat = 'YV12', description = 'YV12' } { discrete: width = 640, height = 480 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 160, height = 120 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 176, height = 144 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 320, height = 240 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 352, height = 288 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 1280, height = 720 } Time interval between frame: 1/30, 1/25, 1/20, 1/15, 1/10, 1/5, { discrete: width = 1280, height = 800 } Time interval between frame: 1/15, 1/10, 1/5, { discrete: width = 1280, height = 960 } Time interval between frame: 1/15, 1/10, 1/5, { discrete: width = 1280, height = 1024 } Time interval between frame: 1/15, 1/10, 1/5, if i run luvcview by itself, the image is funny (blue and red colors, mainly, with myself in negative state). tx

    Read the article

  • Big Data – Operational Databases Supporting Big Data – Columnar, Graph and Spatial Database – Day 14 of 21

    - by Pinal Dave
    In yesterday’s blog post we learned the importance of the Key-Value Pair Databases and Document Databases in the Big Data Story. In this article we will understand the role of Columnar, Graph and Spatial Database supporting Big Data Story. Now we will see a few of the examples of the operational databases. Relational Databases (The day before yesterday’s post) NoSQL Databases (The day before yesterday’s post) Key-Value Pair Databases (Yesterday’s post) Document Databases (Yesterday’s post) Columnar Databases (Tomorrow’s post) Graph Databases (Today’s post) Spatial Databases (Today’s post) Columnar Databases  Relational Database is a row store database or a row oriented database. Columnar databases are column oriented or column store databases. As we discussed earlier in Big Data we have different kinds of data and we need to store different kinds of data in the database. When we have columnar database it is very easy to do so as we can just add a new column to the columnar database. HBase is one of the most popular columnar databases. It uses Hadoop file system and MapReduce for its core data storage. However, remember this is not a good solution for every application. This is particularly good for the database where there is high volume incremental data is gathered and processed. Graph Databases For a highly interconnected data it is suitable to use Graph Database. This database has node relationship structure. Nodes and relationships contain a Key Value Pair where data is stored. The major advantage of this database is that it supports faster navigation among various relationships. For example, Facebook uses a graph database to list and demonstrate various relationships between users. Neo4J is one of the most popular open source graph database. One of the major dis-advantage of the Graph Database is that it is not possible to self-reference (self joins in the RDBMS terms) and there might be real world scenarios where this might be required and graph database does not support it. Spatial Databases  We all use Foursquare, Google+ as well Facebook Check-ins for location aware check-ins. All the location aware applications figure out the position of the phone with the help of Global Positioning System (GPS). Think about it, so many different users at different location in the world and checking-in all together. Additionally, the applications now feature reach and users are demanding more and more information from them, for example like movies, coffee shop or places see. They are all running with the help of Spatial Databases. Spatial data are standardize by the Open Geospatial Consortium known as OGC. Spatial data helps answering many interesting questions like “Distance between two locations, area of interesting places etc.” When we think of it, it is very clear that handing spatial data and returning meaningful result is one big task when there are millions of users moving dynamically from one place to another place & requesting various spatial information. PostGIS/OpenGIS suite is very popular spatial database. It runs as a layer implementation on the RDBMS PostgreSQL. This makes it totally unique as it offers best from both the worlds. Courtesy: mushroom network Tomorrow In tomorrow’s blog post we will discuss about very important components of the Big Data Ecosystem – Hive. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

  • Can't save data for a member in a data form

    - by RahulS
    Implied sharing is an old thing everyone knows the reasons and solutions of that, still little theory about that: With Essbase implied sharing, some members are shared even if you do not explicitly set them as shared. These members are implied shared members. When an implied share relationship is created, each implied member assumes the other member’s value. Essbase assumes (or implies) a shared member relationship in these situations: 1. A parent has only one child 2. A parent has only one child that consolidates to the parent In a Planning form that contains members with an implied sharing relationship, when a value is added for the parent, the child assumes the same value after the form is saved. Likewise, if a value is added for the child, the parent usually assumes the same value after a form is saved.For example, when a calculation script or load rule populates an implied share member, the other implied share member assumes the value of the member populated by the calculation script or load rule. The last value calculated or imported takes precedence. The result is the same whether you refer to the parent or the child as a variable in a calculation script. For more information have a look at: http://docs.oracle.com/cd/E17236_01/epm.1112/hp_admin_11122/ch14s11.html Now the issue which we are going to talk about is We loose data on save even when the parent is dynamic calc and has a single child. A dynamic calc parent to a single child:  If we design the form with following selection: In the data form we will find parent below the member and this is by design whenever you make a selection using commands to select all the member below parent, always children will appear before the parent: Lets try to enter data, Save it Now, try to change the way we selected members Here we go: Now the question again why this behavior: 1. Data from Planning data form passes to Essbase row by row, 2. Because in data form the child member appears before the parent, 3. First, data goes to Essbase for child (SingleStoreChild), 4. Then when Planning passes the data for parent there was #Missing or No data,  5. Over writes the data to #missing. PS: As we know that dynamic calc members are calculated on the fly they are not allocated with any memory in the Essbase, here the parent was dynamic calc and it was pointing to same memory as child in the background, when Planning was passing data to Essbase for second row it has updated the child with missing data.(Little confusing, let me know if you need more explanation) 6. As one of the solutions just change the order of appearance of parent and child. Cheers..!!! Rahul S. https://www.facebook.com/pages/HyperionPlanning/117320818374228

    Read the article

  • Parse and read data frame in C?

    - by user253656
    I am writing a program that reads the data from the serial port on Linux. The data are sent by another device with the following frame format: |start | Command | Data | CRC | End | |0x02 | 0x41 | (0-127 octets) | | 0x03| ---------------------------------------------------- The Data field contains 127 octets as shown and octet 1,2 contains one type of data; octet 3,4 contains another data. I need to get these data I know how to write and read data to and from a serial port in Linux, but it is just to write and read a simple string (like "ABD") My issue is that I do not know how to parse the data frame formatted as above so that I can: get the data in octet 1,2 in the Data field get the data in octet 3,4 in the Data field get the value in CRC field to check the consistency of the data Here the sample snip code that read and write a simple string from and to a serial port in Linux: int writeport(int fd, char *chars) { int len = strlen(chars); chars[len] = 0x0d; // stick a <CR> after the command chars[len+1] = 0x00; // terminate the string properly int n = write(fd, chars, strlen(chars)); if (n < 0) { fputs("write failed!\n", stderr); return 0; } return 1; } int readport(int fd, char *result) { int iIn = read(fd, result, 254); result[iIn-1] = 0x00; if (iIn < 0) { if (errno == EAGAIN) { printf("SERIAL EAGAIN ERROR\n"); return 0; } else { printf("SERIAL read error %d %s\n", errno, strerror(errno)); return 0; } } return 1; } Does anyone please have some ideas? Thanks all.

    Read the article

  • Is Data Science “Science”?

    - by BuckWoody
    I hold the term “science” in very high esteem. I grew up on the Space Coast in Florida, and eventually worked at the Kennedy Space Center, surrounded by very intelligent people who worked in various scientific fields. Recently a new term has entered the computing dialog – “Data Scientist”. Since it’s not a standard term, it has a lot of definitions, and in fact has been disputed as a correct term. After all, the reasoning goes, if there’s no such thing as “Data Science” then how can there be a Data Scientist? This argument has been made before, albeit with a different term – “Computer Science”. In Peter Denning’s excellent article “Is Computer Science Science” (April  2005/Vol. 48, No. 4 COMMUNICATIONS OF THE ACM) there are many points that separate “science” from “engineering” and even “art”.  I won’t repeat the content of that article here (I recommend you read it on your own) but will leverage the points he makes there. Definition of Science To ask the question “is data science ‘science’” then we need to start with a definition of terms. Various references put the definition into the same basic areas: Study of the physical world Systematic and/or disciplined study of a subject area ...and then they include the things studied, the bodies of knowledge and so on. The word itself comes from Latin, and means merely “to know” or “to study to know”. Greek divides knowledge further into “truth” (episteme), and practical use or effects (tekhne). Normally computing falls into the second realm. Definition of Data Science And now a more controversial definition: Data Science. This term is so new and perhaps so niche that the major dictionaries haven’t yet picked it up (my OED reference is older – can’t afford to pop for the online registration at present). Researching the term's general use I created an amalgam of the definitions this way: “Studying and applying mathematical and other techniques to derive information from complex data sets.” Using this definition, data science certainly seems to be science - it's learning about and studying some object or area using systematic methods. But implicit within the definition is the word “application”, which makes the process more akin to engineering or even technology than science. In fact, I find that using these techniques – and data itself – part of science, not science itself. I leave out the concept of studying data patterns or algorithms as part of this discipline. That is actually a domain I see within research, mathematics or computer science. That of course is a type of science, but does not seek for practical applications. As part of the argument against calling it “Data Science”, some point to the scientific method of creating a hypothesis, testing with controls, testing results against the hypothesis, and documenting for repeatability.  These are not steps that we often take in working with data. We normally start with a question, and fit patterns and algorithms to predict outcomes and find correlations. In this way Data Science is more akin to statistics (and in fact makes heavy use of them) in the process rather than starting with an assumption and following on with it. So, is Data Science “Science”? I’m uncertain – and I’m uncertain it matters. Even if we are facing rampant “title inflation” these days (does anyone introduce themselves as a secretary or supervisor anymore?) I can tolerate the term at least from the intent that we use data to study problems across a wide spectrum, rather than restricting it to a single domain. And I also understand those who have worked hard to achieve the very honorable title of “scientist” who have issues with those who borrow the term without asking. What do you think? Science, or not? Does it matter?

    Read the article

  • Data Modeling Resources

    - by Dejan Sarka
    You can find many different data modeling resources. It is impossible to list all of them. I selected only the most valuable ones for me, and, of course, the ones I contributed to. Books Chris J. Date: An Introduction to Database Systems – IMO a “must” to understand the relational model correctly. Terry Halpin, Tony Morgan: Information Modeling and Relational Databases – meet the object-role modeling leaders. Chris J. Date, Nikos Lorentzos and Hugh Darwen: Time and Relational Theory, Second Edition: Temporal Databases in the Relational Model and SQL – all theory needed to manage temporal data. Louis Davidson, Jessica M. Moss: Pro SQL Server 2012 Relational Database Design and Implementation – the best SQL Server focused data modeling book I know by two of my friends. Dejan Sarka, et al.: MCITP Self-Paced Training Kit (Exam 70-441): Designing Database Solutions by Using Microsoft® SQL Server™ 2005 – SQL Server 2005 data modeling training kit. Most of the text is still valid for SQL Server 2008, 2008 R2, 2012 and 2014. Itzik Ben-Gan, Lubor Kollar, Dejan Sarka, Steve Kass: Inside Microsoft SQL Server 2008 T-SQL Querying – Steve wrote a chapter with mathematical background, and I added a chapter with theoretical introduction to the relational model. Itzik Ben-Gan, Dejan Sarka, Roger Wolter, Greg Low, Ed Katibah, Isaac Kunen: Inside Microsoft SQL Server 2008 T-SQL Programming – I added three chapters with theoretical introduction and practical solutions for the user-defined data types, dynamic schema and temporal data. Dejan Sarka, Matija Lah, Grega Jerkic: Training Kit (Exam 70-463): Implementing a Data Warehouse with Microsoft SQL Server 2012 – my first two chapters are about data warehouse design and implementation. Courses Data Modeling Essentials – I wrote a 3-day course for SolidQ. If you are interested in this course, which I could also deliver in a shorter seminar way, you can contact your closes SolidQ subsidiary, or, of course, me directly on addresses [email protected] or [email protected]. This course could also complement the existing courseware portfolio of training providers, which are welcome to contact me as well. Logical and Physical Modeling for Analytical Applications – online course I wrote for Pluralsight. Working with Temporal data in SQL Server – my latest Pluralsight course, where besides theory and implementation I introduce many original ways how to optimize temporal queries. Forthcoming presentations SQL Bits 12, July 17th – 19th, Telford, UK – I have a full-day pre-conference seminar Advanced Data Modeling Topics there.

    Read the article

  • Deploying Data Mining Models using Model Export and Import

    - by [email protected]
    In this post, we'll take a look at how Oracle Data Mining facilitates model deployment. After building and testing models, a next step is often putting your data mining model into a production system -- referred to as model deployment. The ability to move data mining model(s) easily into a production system can greatly speed model deployment, and reduce the overall cost. Since Oracle Data Mining provides models as first class database objects, models can be manipulated using familiar database techniques and technology. For example, one or more models can be exported to a flat file, similar to a database table dump file (.dmp). This file can be moved to a different instance of Oracle Database EE, and then imported. All methods for exporting and importing models are based on Oracle Data Pump technology and found in the DBMS_DATA_MINING package. Before performing the actual export or import, a directory object must be created. A directory object is a logical name in the database for a physical directory on the host computer. Read/write access to a directory object is necessary to access the host computer file system from within Oracle Database. For our example, we'll work in the DMUSER schema. First, DMUSER requires the privilege to create any directory. This is often granted through the sysdba account. grant create any directory to dmuser; Now, DMUSER can create the directory object specifying the path where the exported model file (.dmp) should be placed. In this case, on a linux machine, we have the directory /scratch/oracle. CREATE OR REPLACE DIRECTORY dmdir AS '/scratch/oracle'; If you aren't sure of the exact name of the model or models to export, you can find the list of models using the following query: select model_name from user_mining_models; There are several options when exporting models. We can export a single model, multiple models, or all models in a schema using the following procedure calls: BEGIN   DBMS_DATA_MINING.EXPORT_MODEL ('MY_MODEL.dmp','dmdir','name =''MY_DT_MODEL'''); END; BEGIN   DBMS_DATA_MINING.EXPORT_MODEL ('MY_MODELS.dmp','dmdir',              'name IN (''MY_DT_MODEL'',''MY_KM_MODEL'')'); END; BEGIN   DBMS_DATA_MINING.EXPORT_MODEL ('ALL_DMUSER_MODELS.dmp','dmdir'); END; A .dmp file can be imported into another schema or database using the following procedure call, for example: BEGIN   DBMS_DATA_MINING.IMPORT_MODEL('MY_MODELS.dmp', 'dmdir'); END; As with models from any data mining tool, when moving a model from one environment to another, care needs to be taken to ensure the transformations that prepare the data for model building are matched (with appropriate parameters and statistics) in the system where the model is deployed. Oracle Data Mining provides automatic data preparation (ADP) and embedded data preparation (EDP) to reduce, or possibly eliminate, the need to explicitly transport transformations with the model. In the case of ADP, ODM automatically prepares the data and includes the necessary transformations in the model itself. In the case of EDP, users can associate their own transformations with attributes of a model. These transformations are automatically applied when applying the model to data, i.e., scoring. Exporting and importing a model with ADP or EDP results in these transformations being immediately available with the model in the production system.

    Read the article

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >