Search Results

Search found 30279 results on 1212 pages for 'database drift'.

Page 609/1212 | < Previous Page | 605 606 607 608 609 610 611 612 613 614 615 616  | Next Page >

  • What algorithms can I use to detect if articles or posts are duplicates?

    - by michael
    I'm trying to detect if an article or forum post is a duplicate entry within the database. I've given this some thought, coming to the conclusion that someone who duplicate content will do so using one of the three (in descending difficult to detect): simple copy paste the whole text copy and paste parts of text merging it with their own copy an article from an external site and masquerade as their own Prepping Text For Analysis Basically any anomalies; the goal is to make the text as "pure" as possible. For more accurate results, the text is "standardized" by: Stripping duplicate white spaces and trimming leading and trailing. Newlines are standardized to \n. HTML tags are removed. Using a RegEx called Daring Fireball URLs are stripped. I use BB code in my application so that goes to. (ä)ccented and foreign (besides Enlgish) are converted to their non foreign form. I store information about each article in (1) statistics table and in (2) keywords table. (1) Statistics Table The following statistics are stored about the textual content (much like this post) text length letter count word count sentence count average words per sentence automated readability index gunning fog score For European languages Coleman-Liau and Automated Readability Index should be used as they do not use syllable counting, so should produce a reasonably accurate score. (2) Keywords Table The keywords are generated by excluding a huge list of stop words (common words), e.g., 'the', 'a', 'of', 'to', etc, etc. Sample Data text_length, 3963 letter_count, 3052 word_count, 684 sentence_count, 33 word_per_sentence, 21 gunning_fog, 11.5 auto_read_index, 9.9 keyword 1, killed keyword 2, officers keyword 3, police It should be noted that once an article gets updated all of the above statistics are regenerated and could be completely different values. How could I use the above information to detect if an article that's being published for the first time, is already existing within the database? I'm aware anything I'll design will not be perfect, the biggest risk being (1) Content that is not a duplicate will be flagged as duplicate (2) The system allows the duplicate content through. So the algorithm should generate a risk assessment number from 0 being no duplicate risk 5 being possible duplicate and 10 being duplicate. Anything above 5 then there's a good possibility that the content is duplicate. In this case the content could be flagged and linked to the article's that are possible duplicates and a human could decide whether to delete or allow. As I said before I'm storing keywords for the whole article, however I wonder if I could do the same on paragraph basis; this would also mean further separating my data in the DB but it would also make it easier for detecting (2) in my initial post. I'm thinking weighted average between the statistics, but in what order and what would be the consequences...

    Read the article

  • What Counts for a DBA: Passion

    - by drsql
    One of my first questions, when interviewing for a DBA/Programmer position, is always: “Why do you want this job?” The answers I receive range from cheesy hyperbole (“I want to enhance your services with my vast knowledge”) to deadpan realism (“I have N kids who all have a hole in the front of their face where food goes"). Both answers are fine in their own way, at least displaying some self-confidence, humour and honesty, but once in a while, I'll hear the answer that is music to me ears... “I LOVE DATABASES!” Whenever I hear it, my nerves tingle in hopeful anticipation; have I found someone for whom working with database isn't just a job, but a passion? Inevitably, I'm often disappointed. What initially seemed like passion turns out to be rather shallow enthusiasm; the person is enthusiastic about working with databases in the same way he or she might be about eating a bag of Cajun spiced kettle chips; enjoyable, but not something to think about too deeply or take too seriously. Enthusiasm comes, and enthusiasm goes. I've seen countless technical forum users burst onto the scene in a blaze of frantic question-answering, only to fade away within days, never to be heard from again. Passion, however, is more of a longstanding commitment. The biographies of the great technologists and authors of the recent past are full of the sort of passion and engrossment that lead a person to write a novel non-stop for a fortnight with no sleep and only dog food to eat (Philip K. Dick), or refuse to leave the works of the first tunnel under the Thames, even though it was flooded (Brunel). In a similar (though more modest) way, my passion for working with databases has led me to acts that might cause someone for whom it was "just a job" to roll their eyes in disbelief. Most evenings you're more likely to find me reading a database book than watching TV. I've spent hundreds of hours of my spare time writing blogs and articles (some of which are only read by tens of people); I've spent hundreds of dollars travelling to conferences, paying my own flight and hotel expenses, so that I can share a little of what I know, and mix with some like-minded people. And I know I'm far from alone in this, in the SQL Server community. Passion isn't everything, of course, and it isn't always accompanied by any great skill, but in almost every case, that skill can be cultivated over time. If you are doing what you are passionate about, work turns into more than just a way to feed your kids; it becomes your hobby, entertainment, and preoccupation. And it is this passion that gives a DBA the obsessive stubbornness, the refusal to be beaten by even the most difficult problem, which is often so crucial. A final word of warning though: passion without limits can turn weird. Never let it get in the way of your wife, kids, bills, or personal hygiene.

    Read the article

  • Strategy to use two different measurement systems in software

    - by Dennis
    I have an application that needs to accept and output values in both US Custom Units and Metric system. Right now the conversion and input and output is a mess. You can only enter in US system, but you can choose the output to be US or Metric, and the code to do the conversions is everywhere. So I want to organize this and put together some simple rules. So I came up with this: Rules user can enter values in either US or Metric, and User Interface will take care of marking this properly All units internally will be stored as US, since the majority of the system already has most of the data stored like that and depends on this. It shouldn't matter I suppose as long as you don't mix unit. All output will be in US or Metric, depending on user selection/choice/preference. In theory this sounds great and seems like a solution. However, one little problem I came across is this: There is some data stored in code or in the database that already returns data like this: 4 x 13/16" screws, which means "four times screws". I need the to be in either US or Metric. Where exactly do I put the conversion code for doing the conversion for this unit? The above already mixing presentation and data, but the data for the field I need to populate is that whole string. I can certainly split it up into the number 4, the 13/16", and the " x " and the " screws", but the question remains... where do I put the conversion code? Different Locations for Conversion Routines 1) Right now the string is in a class where it's produced. I can put conversion code right into that class and it may be a good solution. Except then, I want to be consistent so I will be putting conversion procedures everywhere in the code at-data-source, or right after reading it from the database. The problem though is I think that my code will have to deal with two systems, all throughout the codebase after this, should I do this. 2) According to the rules, my idea was to put it in the view script, aka last change to modify it before it is shown to the user. And it may be the right thing to do, but then it strikes me it may not always be the best solution. (First, it complicates the view script a tad, second, I need to do more work on the data side to split things up more, or do extra parsing, such as in my case above). 3) Another solution is to do this somewhere in the data prep step before the view, aka somewhere in the middle, before the view, but after the data-source. This strikes me as messy and that could be the reason why my codebase is in such a mess right now. It seems that there is no best solution. What do I do?

    Read the article

  • SSIS Basics: Setting Up Your Initial Package

    Up until now, it has been a curiously frustrating search to find out the basics of SSIS, fast, in order to get up and running quickly. No longer, as Annette Allen comes up with a simple introduction for the rest of us. What are your servers really trying to tell you? Find out with new SQL Monitor 3.0, an easy-to-use tool built for no-nonsense database professionals.For effortless insights into SQL Server, download a free trial today.

    Read the article

  • Oracle to SQL Server: Crossing the Great Divide, Part 1

    When a SQL expert moves from Oracle to SQL Server, he can spot obvious strengths and weaknesses in the product that are too familiar to be apparent to the SQL Server DBA. Jonathan Lewis is one such expert: In this article he records his train of thought whilst investigating the mechanics of the SQL Server database engine. The result makes interesting reading.

    Read the article

  • Red Gate Software announces speaker line up for US SQL in the City tour

    SQL in the City is a free, full day training and networking event for database professionals. After the success of last year’s event, Red Gate has expanded the event to cover six cities from sea to shining sea, including: New York, Austin, San Francisco, Chicago, Boston, and Seattle. Compress live data by 73% Red Gate's SQL Storage Compress reduces the size of live SQL Server databases, saving you disk space and storage costs. Learn more.

    Read the article

  • Celko's SQL Stumper: Eggs in one Basket

    Joe Celko returns with another stumper to celebrate Easter. Unsurprisingly, this involves eggs. More surprising is the nature of the puzzle: This time, the puzzle is one of designing a database rather than a query. DDL as well as the DML.

    Read the article

  • SQL SERVER What is AdventureWorks?

    NOTE: If you know the answer of this question, then I request you to stop reading this post right now. Please do not leave comment about this blog post not being useful to you, if you knew the answer. Few days ago, I received DM asking What is an AdventureWorks database and why in all [...]...Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

    Read the article

  • Backup Compression - time for an overhaul

    - by jchang
    Database backup compression is incredibly useful and valuable. This became popular with then Imceda (later Quest and now Dell) LiteSpeed. SQL Server version 2008 added backup compression for Enterprise Edition only. The SQL Server EE native backup feature only allows a single compression algorithm, one that elects for CPU efficiency over the degree of compression achieved. In the long ago past, this strategy was essential. But today the benefits are irrelevant while the lower compression is becoming...(read more)

    Read the article

  • Critical Patch Updates During EBS 11i Exception to Sustaining Support Period

    - by Elke Phelps (Oracle Development)
    As previously blogged in the EBS 11i and 12.1 Support Timeline Changes entry, two important changes to the Oracle Lifetime Support policies were announced at Oracle OpenWorld 2012 - San Francisco.  These changes affect E-Business Suite Releases 11i and 12.1. Critical Patch Updates for EBS 11i during the Exception to Sustaining Support Period You may be wondering about the availability of Critical Patch Updates (CPU) for EBS 11i during the Exception to Sustaining Support period.  The following details the E-Business Suite Critical Patch Update support policy for EBS 11i during the Exception to Sustaining Support period: Oracle will continue to provide CPUs containing critical security fixes for E-Business Suite 11i.  CPUs will be packaged and released as as cumulative patches for both ATG RUP 6 and ATG RUP 7. As always, we try to minimize the number of patches and dependencies required for uptake of a CPU; however, there have been quite a few changes to the 11i baseline since its release.  For dependency reasons the 11i CPUs may require a higher number of files in order to bring them up to a consistent, stable, and well tested level. EBS 11i customer will continue to receive CPUs up to and including the October 2014 CPU. Where can I learn more? There are two interlocking policies that affect the E-Business Suite:  Oracle's Lifetime Support policies for each EBS release (timelines which were updated by this announcement), and the Error Correction Support policies (which state the minimum baselines for new patches). For more information about how these policies interact, see: Understanding Support Windows for E-Business Suite Releases What about E-Business Suite technology stack components? Things get more complicated when one considers individual techstack components such as Oracle Forms or the Oracle Database.  To learn more about the interlocking EBS+techstack component support windows, see these two articles: On Apps Tier Patching and Support: A Primer for E-Business Suite Users On Database Patching and Support: A Primer for E-Business Suite Users Where can I learn more about Critical Patch Updates?The Critical Patch Update Advisory is the starting point for relevant information. It includes a list of products affected, pointers to obtain the patches, a summary of the security vulnerabilities, and links to other important documents.  Related Articles EBS 11i and 12.1 Support Timeline Changes Frequently Asked Questions about Latest EBS Support Changes Extended Support Fees Waived for E-Business Suite 11i and 12.0

    Read the article

  • Register Now to the New Oracle Argus Safety 7 Implementation Boot Camp in Miami, Florida - Nov 12-15, 2013!

    - by Roxana Babiciu
    Oracle's Argus Safety 7 boot camp is an instructor-led training course which provides a good understanding of how Oracle Argus Safety Standard Edition and Oracle Argus Safety Japan products addresses complex pharmacovigilance requirements and helps ensure global regulatory compliance by enabling sound safety decisions. Oracle Argus Safety's advanced database helps ensure global regulatory compliance thus in turn enabling sound safety decisions. Register now to this boot camp, a 4-day (in class) instructor led event taught using a combination of lectures and hands-on exercises.

    Read the article

  • Upgrade Workshops in Bucharest, Athens and Warsaw

    - by Mike Dietrich
    Finally travel time is not over yet. There are 3 more workshops Upgrade, Migrate & Consolidate to Oracle Database 12c due to happen within the next few weeks:. June 17 in Bucharest, Romaniain the Radisson Blu Hotel - Register here!. July 10 in Athens, Greece in the Pentelikon Hotel - Register here!. July 15 in Warsaw, Poland in the Marriot Warsaw Hotel - Register here!. - CU there - Mike 

    Read the article

  • Stuxnet - how it infects

    - by Kit Ong
    Except from the CNET article.http://news.cnet.com/8301-13772_3-57413329-52/stuxnet-delivered-to-iranian-nuclear-plant-on-thumb-drive/?part=propeller&subj=news&tag=linkvThe Stuxnet worm propagates by exploiting a hole in all versions of Windows in the code that processes shortcut files, ending in ".lnk," according to...[the] Microsoft Malware Protection Center....Merely browsing to the removable media drive using an application that displays shortcut icons, such as Windows Explorer, will run the malware without the user clicking on the icons. The worm infects USB drives or other removable storage devices that are subsequently connected to the infected machine. Those USB drives then infect other machines much like the common cold is spread by infected people sneezing into their hands and then touching door knobs that others are handling.The malware includes a rootkit, which is software designed to hide the fact that a computer has been compromised, and other software that sneaks onto computers by using a digital certificates signed two Taiwanese chip manufacturers that are based in the same industrial complex in Taiwan--RealTek and JMicron, according to Chester Wisniewski, senior security advisor at Sophos.... It is unclear how the digital signatures were acquired by the attacker, but experts believe they were stolen and that the companies were not involved.Once the machine is infected, a Trojan looks to see if the computer it lands on is running Siemens' Simatic WinCC software. The malware then automatically uses a default password that is hard-coded into the software to access the control system's Microsoft SQL database. The Stuxnet worm propagates by exploiting a hole in all versions of Windows in the code that processes shortcut files, ending in ".lnk," according to...[the] Microsoft Malware Protection Center....Merely browsing to the removable media drive using an application that displays shortcut icons, such as Windows Explorer, will run the malware without the user clicking on the icons. The worm infects USB drives or other removable storage devices that are subsequently connected to the infected machine. Those USB drives then infect other machines much like the common cold is spread by infected people sneezing into their hands and then touching door knobs that others are handling.The malware includes a rootkit, which is software designed to hide the fact that a computer has been compromised, and other software that sneaks onto computers by using a digital certificates signed two Taiwanese chip manufacturers that are based in the same industrial complex in Taiwan--RealTek and JMicron, according to Chester Wisniewski, senior security advisor at Sophos.... It is unclear how the digital signatures were acquired by the attacker, but experts believe they were stolen and that the companies were not involved.Once the machine is infected, a Trojan looks to see if the computer it lands on is running Siemens' Simatic WinCC software. The malware then automatically uses a default password that is hard-coded into the software to access the control system's Microsoft SQL database.

    Read the article

  • Last chance for a day of free SQL Server training at SQL in the City 2012

    SQL Server developers and database administrators have one last chance for a full day of free training and networking at SQL in the City 2012. NEW! Deployment Manager Early Access ReleaseDeploy SQL Server changes and .NET applications fast, frequently, and without fuss, using Deployment Manager, the new tool from Red Gate. Try the Early Access Release to get a 20% discount on Version 1. Download the Early Access Release.

    Read the article

  • Building a Scale Out SSRS 2008 R2 Farm using Windows NLB Part 4

    Delivering reports is becoming more critical due to the increasing demand for business intelligence solutions. And while there are a lot of guides that walk us through building a highly available database engine, you’ll rarely see one for SQL Server Reporting Services. How do I go about building a scale-out SQL Server 2008 R2 Reporting Services running on Windows Server 2008 R2? Get smart with SQL Backup ProPowerful centralised management, encryption and more.SQL Backup Pro was the smartest kid at school. Discover why.

    Read the article

  • Architecture for a site "blog"/"news reel"

    - by DavidScherer
    I really don't want to undertake handling blog/news posts within a site I'm working on and would much rather use some other software that's fairly bare-bones that will manage the posts and then I can just pull posts from the DB or an API. Does anyone have any experience with a nice, lightweight OS Blog type software that has either an API or is basic enough to simply pull Data from the database? I really only need the software for managing, I plan to display all the posts programatically through MVC.

    Read the article

  • ArchBeat Link-o-Rama for 10-24-2012

    - by Bob Rhubart
    Play Oracle Vanquisher Here's a little respite from whatever it is you normally spend your time on. Oracle Vanquisher is an online diversion that makes a game of data center optimization. According to the description: "Armed with a cool Oracle vacuum pack suit and a strategic IT roadmap, you will thwart threats and optimize your data center to increase your company’s stock price and boost your company's position." Mainly you avoid electric shock and killer birds. The current high score belongs to someone identified as "TEN." My score? Never mind. Book: DevOps for Developers | The Java Source The subject of DevOps has come up in a couple of recent OTN ArchBeat Podcasts, so it's somewhat serendipitous that Tori Weildt's recent blog post offers an overview of Java Champion Michael Hutterman's new book, DevOps for Developers, now available from Apress. Bring Your Own Device (BYOD) : Context is everything… | The ORACLE-BASE Blog BOYD is a factor in the evolution of IT, but in what context? "The real IT work in companies is still being done on PCs," says Oracle ACE Director Tim Hall. "Yes, you can use a cloud service on your phone, but look around the office and you will see those cloud services are actually being used by people on PCs." Oracle in the Cloud: Oracle EBusiness Suite sizing | Tom Laszewski Cloud expert Tom Laszewski shares several technical resources that will be helpful for sizing of Oracle EBusiness Suite. Setting Up, Configuring, and Using an Oracle WebLogic Server Cluster Author and expert Yuli Vasiliev shows you how take advantage of multiple Oracle WebLogic Server instances grouped into a cluster to maximize scalability and availability. Webcast: Reduce Costs with Oracle's Database Storage Management Watch this! Join Oracle experts Kevin Jernigan and Margaret Hamburger for an interactive webcast in which you'll learn how Oracle's Database Storage Management can reduce storage costs and management complexity while improving query performance to meet service-level agreements and compliance requirements. Event Date: Tuesday, November 6, 2012 Event Time: 10 a.m. PT/1 p.m. ET Thought for the Day "Most software today is very much like an Egyptian pyramid with millions of bricks piled on top of each other, with no structural integrity, but just done by brute force and thousands of slaves." — Alan Kay Source: softwarequotes.com

    Read the article

  • Useful links

    - by Madhan ayyasamy
    Difference between STI vs Polymorphic associationsSTI vs PolymorphicDifference between habtm vs has_many :throughhabtm vs throughCapistrano Guide linkCapistrano guideRails application without database stuffclass Car < ActiveRecord::Baseself.abstract = trueendAnother link: rails without databaseNamed scope useful linkNamed scopeDifference between http and https verbhttp vs httpsRails 2.3 useful guide websiterails 2.3 guide

    Read the article

  • Free tools for SQL Server - Automating Execution Plan Analysis

    - by jchang
    Since this topic is being discussed, I will plug my own tools, SQL Exec Stats and (a little dated) documentation the main capability is cross-referencing index usuage with specific execution plans. another feature is generating execution plans for all stored procedures in a database, along with the index usage cross-reference. There are several sources of execution plans or plan handles, this could be a live trace, a previously saved trace, previously saved sqlplan files, from dm_exec_cached_plans,...(read more)

    Read the article

  • SSIS and Parallelism: The Unseen Minions

    Sometimes, a procedural database process cannot easily be reduced to a set-based algorithm in order to reduce the time it takes. Then, you have to find other ways to parallelise it. Other ways? Josef shows how to use SSIS to drastically reduce the time that such a process takes.

    Read the article

  • Defensive Error Handling

    TRY…CATCH error handling in SQL Server has certain limitations and inconsistencies that will trap the unwary developer, used to the more feature-rich error handling of client-side languages such as C# and Java. In this article, abstracted from his excellent new book, Defensive Database Programming with SQL Server, Alex Kuznetsov offers a simple, robust approach to checking and handling errors in SQL Server, with client-side error handling used to enforce what is done on the server.

    Read the article

  • Register Now to the New Oracle Argus Safety 7 Implementation Boot Camp - Tokyo, Japan - Dec 10-13, 2013!

    - by Roxana Babiciu
    Oracle's Argus Safety 7 boot camp is an instructor-led training course which provides a good understanding of how Oracle Argus Safety Standard Edition and Oracle Argus Safety Japan products addresses complex pharmacovigilance requirements and helps ensure global regulatory compliance by enabling sound safety decisions. Oracle Argus Safety's advanced database helps ensure global regulatory compliance thus in turn enabling sound safety decisions. Read more here. 

    Read the article

< Previous Page | 605 606 607 608 609 610 611 612 613 614 615 616  | Next Page >