Search Results

Search found 5416 results on 217 pages for 'urls py'.


  • Guide on crawling the entire web?

    - by bohohasdhfasdf
    I just had this thought and was wondering whether it's possible to crawl the entire web (just like the big boys!) on a single dedicated server (say, a Core 2 Duo with 8 GB RAM, a 750 GB disk, and a 100 Mbps link). I've come across a paper where this was done, but I cannot recall its title; it was about crawling the entire web on a single dedicated server using some statistical model. Anyway, imagine starting with around 10,000 seed URLs and doing an exhaustive crawl: is it possible? I need to crawl the web but am limited to one dedicated server. How can I do this, and is there an open source solution out there already? For example, see this real-time search engine: http://crawlrapidshare.com The results are extremely good and freshly updated. How are they doing this?


  • Python 3-compatible HTML to text converter preserving basic structure under permissive licence?

    - by hawk64
    I am looking for a relatively simple HTML to text converter which displays links and works on strings. So far I have tried: lynx, but its performance is too poor; html2text, which gives weird and verbose Markdown output and is under GPLv3, which is too restrictive for my (BSD-licensed) project; and http://effbot.org/librarybook/formatter-example-3.py, which uses htmllib.HTMLParser with formatter.AbstractFormatter and a custom writer, but htmllib.HTMLParser is deprecated and has been removed from Python 3. So is there any simple, performant, Python 3-compatible HTML to text converter under a permissive license such as MIT/BSD/Apache and the like? Edit: I don't just need something that strips HTML tags; it should also preserve the basic structure of the HTML, that is, produce output that somewhat resembles that of Lynx.


  • Cross-domain DOM access and manipulation in Java?

    - by gaqer
    In my Java app, how can I embed a browser (which loads and renders URLs) in Swing, access its DOM, and manipulate the HTML? How can you embed such a browser in a Rich Internet Application and access its DOM? More specifically, in Vaadin? Is there an HTTP proxy class that can load an external URL and render it to the user? This is what I was doing on a LAMP stack, but I want to switch to Vaadin or some other Java web framework where I can use Java for everything from server-side to client-side logic, so I can focus more on application logic (in other words, I am looking for developer-friendly frameworks like Vaadin). Thank you and have a great weekend!


  • Remove special chars from URL

    - by John Jones
    Hi, I have a product database and I am trying to display the products with clean URLs. Example product names:
        PAUL MITCHELL FOAMING POMADE (150ml)
        American Crew Classic Gents Pomade 85g
        Tigi Catwalk Texturizing Pomade 50ml
    What I need is to display them like this in the URL structure: www.example.com/products/paul-mitchell-foaming-gel(150ml) The problem is that I also want to do the following:
        Remove anything in brackets (and the brackets)
        Remove any numbers next to g or ml, e.g. 400ml, 10g etc.
    I have been banging my head trying different string replaces but can't get it right. I would really appreciate some help. Cheers
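The two removal rules in this question can be sketched with regular expressions. This is only a minimal Python illustration, not the asker's code: the function name product_slug and the exact patterns are assumptions, and the question does not say which language the site uses.

```python
import re


def product_slug(name: str) -> str:
    """Turn a product name into a URL slug (hypothetical helper)."""
    # Rule 1: drop anything in brackets, brackets included, e.g. "(150ml)".
    name = re.sub(r"\([^)]*\)", " ", name)
    # Rule 2: drop a number directly followed by g or ml, e.g. "85g", "50 ml".
    name = re.sub(r"\b\d+(?:\.\d+)?\s*(?:ml|g)\b", " ", name, flags=re.IGNORECASE)
    # Collapse every remaining run of non-alphanumerics into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


for example in (
    "PAUL MITCHELL FOAMING POMADE (150ml)",
    "American Crew Classic Gents Pomade 85g",
    "Tigi Catwalk Texturizing Pomade 50ml",
):
    print(product_slug(example))
```

The order matters: brackets go first so that a size inside brackets disappears with them, and the word boundary after g/ml keeps words such as "Gents" intact.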


  • Redirecting part of URL in Nginx

    - by Maca
    I'm trying to move my blog to a new site and want to redirect some URLs. I use nginx. An example URL: https://blogurl.com/news/2014-08-19/post-3451/mt-preview-33e2742af1eb.php The /news/2014-08-19/post-3451/mt-preview-33e2742af1eb.php part varies from post to post. It should redirect to: https://blogurl.com/content/news/2014-08-19/post-3451/mt-preview-33e2742af1eb.php I basically want to insert /content/ after https://blogurl.com, and so far I have: rewrite ^(.*)$ /content/ break; My issue is that my CMS sits at the same directory level (https://blogurl.com/mt/admin), and if I simply apply the rewrite above, my CMS address would move too. How can I prevent this?


  • Porting library from Java to Python

    - by Mike Griffith
    I'm about to port a smallish library from Java to Python and wanted some advice (smallish ~ a few thousand lines of code). I've studied the Java code a little, and noticed some design patterns that are common in both languages. However, there were definitely some Java-only idioms (singletons, etc) present that are generally not-well-received in Python-world. I know at least one tool (j2py) exists that will turn a .java file into a .py file by walking the AST. Some initial experimentation yielded less than favorable results. Should I even be considering using an automated tool to generate some code, or are the languages different enough that any tool would create enough re-work to have justified writing from scratch? If tools aren't the devil, are there any besides j2py that can at least handle same-project import management? I don't expect any tool to match 3rd party libraries from one language to a substitute in another.


  • Google App Engine dev_appserver can't find PIL (I've installed it)

    - by goggin13
    I recently upgraded my Google App Engine launcher on my Mac, running OSX 10.5.8, and afterwards my projects that work with images stopped working locally. It seems to be the same problem that I had when first using GAE locally to work with images, before I installed PIL. Here is the error I get: SystemError: Parent module 'PIL' not loaded I have PIL installed. When I run python normally, I can access it and work with it as expected. I also checked to ensure that dev_appserver.py was running the same version of Python. If I open the interpreter and type sys.version I get this back: 2.5 (r25:51918, Sep 19 2006, 08:49:13) [GCC 4.0.1 (Apple Computer, Inc. build 5341)] This is identical to what I get when I display the sys.version from my projects running through dev_appserver. Any thoughts on why dev_appserver can't find the PIL module? I have been banging my head against this for a bit. Thank you!


  • Retrieving a page that has a redirect

    - by Dmitry Makovetskiyd
    I get my page content with this function:
        private function fetch_url($url){
            $ch = curl_init($url);
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
            curl_setopt($ch, CURLOPT_TIMEOUT, 320);
            $this->doc = curl_exec($ch);
            $this->status_code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
            // echo $this->doc;
            curl_close($ch);
        }
    The problem is that some URLs don't exist on the site and redirect to another page. For example, if I pass http://example.com/uncategorized/ it redirects me to http://example.com/mature/ and with curl I don't get any content. My aim is to get the content of the page I am redirected to. Is there an easy way to get the function to work the way I want?


  • How to handle recursive parent/child problems like this?

    - by lsdude
    In web dev I come across these problems a lot. For example, we have a giant list of URLs in this format:
        /businesses
        /businesses/food
        /businesses/food/wendys
        /businesses/food/wendys/chili
        /businesses/food/wendys/fries
        /businesses/food/wendys/chicken-nuggets
        /businesses/pharmacy/cvs
        /businesses/pharmacy/cvs/toothpaste
        /businesses/pharmacy/cvs/toothpaste/brand
        ...
    and then we need to output each one, where the parent category is in h1 tags, the child is in h2 tags, and the children of that are in h3 tags. I can handle this, but I feel my code is messy. I'm sure there is a design pattern I can use. Languages are usually Ruby/PHP. How would you handle this?
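One common way to tame this kind of parent/child list is to fold the paths into a tree first and then render it with a short recursive walk. The sketch below is in Python rather than the Ruby/PHP the asker mentions, and it assumes that heading level simply equals depth (capped at h6, since HTML has no h7); build_tree and render are made-up names.

```python
from html import escape


def build_tree(paths):
    """Fold '/a/b/c' style paths into nested dicts keyed by path segment."""
    tree = {}
    for path in paths:
        node = tree
        for segment in path.strip("/").split("/"):
            if segment:
                # setdefault creates the child once and reuses it afterwards
                node = node.setdefault(segment, {})
    return tree


def render(tree, level=1):
    """Depth-first walk; the depth decides the heading tag (capped at h6)."""
    lines = []
    for name, children in tree.items():
        tag = f"h{min(level, 6)}"
        lines.append(f"<{tag}>{escape(name)}</{tag}>")
        lines.extend(render(children, level + 1))
    return lines


paths = [
    "/businesses",
    "/businesses/food",
    "/businesses/food/wendys",
    "/businesses/pharmacy/cvs",
]
print("\n".join(render(build_tree(paths))))
```

Because the tree is built from segments, a missing intermediate URL (here /businesses/pharmacy) still gets its own heading.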


  • Can you make a python script behave differently when imported than when run directly?

    - by futuraprime
    I often have to write data parsing scripts, and I'd like to be able to run them in two different ways: as a module and as a standalone script. So, for example:
        def parseData(filename):
            # data parsing code here
            return data

        def HypotheticalCommandLineOnlyHappyMagicFunction():
            print json.dumps(parseData(sys.argv[1]), indent=4)
    The idea is that in another Python script I can write import dataparser and have access to dataparser.parseData, or on the command line I can just run python dataparser.py and it would run my HypotheticalCommandLineOnlyHappyMagicFunction and send the data as JSON to stdout. Is there a way to do this in Python?
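The standard Python idiom for this is a check on the module-level __name__ variable, which is "__main__" only when the file is executed directly. A minimal sketch, written for Python 3 (the question's snippet uses the Python 2 print statement); parse_data here is a placeholder that just returns non-empty lines, not the asker's parser.

```python
import json
import sys


def parse_data(filename):
    """Placeholder parser (assumption): returns the file's non-empty lines."""
    with open(filename, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def main(argv):
    """Command-line entry point; returns a process exit status."""
    if len(argv) != 2:
        print("usage: python dataparser.py FILE", file=sys.stderr)
        return 2
    try:
        data = parse_data(argv[1])
    except OSError as exc:
        print(f"cannot read {argv[1]}: {exc}", file=sys.stderr)
        return 1
    print(json.dumps(data, indent=4))
    return 0


if __name__ == "__main__":
    # Runs for `python dataparser.py FILE`; skipped on `import dataparser`,
    # because then __name__ is "dataparser" rather than "__main__".
    main(sys.argv)
```

Keeping the command-line code in a main function also makes it callable from tests without spawning a process.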


  • What is the best way to format messages for queueing?

    - by Tijmen
    I've been reading up on message queueing lately, and I'd like to implement a simple, extensible system for my app. While there's a lot of good information out there on setting up an MQ system, I can't find much about the actual implementation. I'm looking for patterns and best practices on how to properly format messages for a queue, and ways to execute the jobs in PHP. Should I use JSON, serialized objects, text, URLs or XML? What information should I send? Is a worker with a switch($job['command']) {} (or something like that) the way to go, or are there any established patterns out there for implementing a worker? Help greatly appreciated!
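One frequently used shape for such messages is a small JSON envelope carrying a command name and a payload, with the worker looking the command up in a table of handlers instead of a growing switch. The following is a Python sketch of that idea, not PHP and not tied to any particular queue product; every name in it (encode_job, handle_message, the two handlers) is hypothetical.

```python
import json


def send_email(payload):
    # Stand-in for real work; just describes what it would do.
    return f"email to {payload['to']}"


def resize_image(payload):
    return f"resize {payload['path']} to {payload['width']}px"


# Dispatch table: adding a job type means adding one entry here.
HANDLERS = {"send_email": send_email, "resize_image": resize_image}


def encode_job(command, payload):
    """Producer side: serialize a job as a JSON envelope."""
    return json.dumps({"command": command, "payload": payload})


def handle_message(raw):
    """Worker side: decode the envelope and run the matching handler."""
    job = json.loads(raw)
    handler = HANDLERS.get(job.get("command"))
    if handler is None:
        raise ValueError(f"unknown command: {job.get('command')!r}")
    return handler(job.get("payload", {}))


message = encode_job("send_email", {"to": "a@example.com"})
print(handle_message(message))
```

JSON keeps the message readable by workers written in other languages, which language-specific serialized objects do not.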


  • Getting rid of index.php in the URL when using recess framework and lighttpd

    - by spudnik1979
    I am using the Recess PHP framework with lighttpd. Does anyone know how I can use the shorter URL http://www.myserver.com/recess instead of http://www.myserver.com/index.php/recess ? The Recess readme file says that if I have mod_rewrite I can use the shorter URL:
        -- "Do you have mod_rewrite?
        -- Yes: Open your browser to the location you unzipped
        -- No: Open your browser to the location you unzipped followed by index.php"
    I do have mod_rewrite enabled on lighttpd and I have removed the index.php, but I get a 404. Do I need any special rules in my lighttpd.conf?


  • Using Python, how to copy files in the 'Temporary Internet Files' folder in Windows

    - by pythBegin
    I am using this code to recursively find files in a folder with a size greater than 500000 bytes:
        def listall(parent):
            lis=[]
            for root, dirs, files in os.walk(parent):
                for name in files:
                    if os.path.getsize(os.path.join(root,name))>500000:
                        lis.append(os.path.join(root,name))
            return lis
    This works fine. But when I used it on the 'Temporary Internet Files' folder in Windows, I got this error:
        Traceback (most recent call last):
          File "<pyshell#4>", line 1, in <module>
            listall(a)
          File "<pyshell#2>", line 5, in listall
            if os.path.getsize(os.path.join(root,name))>500000:
          File "C:\Python26\lib\genericpath.py", line 49, in getsize
            return os.stat(filename).st_size
        WindowsError: [Error 123] The filename, directory name, or volume label syntax is incorrect: 'C:\\Documents and Settings\\khedarnatha\\Local Settings\\Temporary Internet Files\\Content.IE5\\EDS8C2V7\\??????+1[1].jpg'
    I think this is because Windows gives files in this specific folder names with special characters. Please help me sort out this issue.
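Whatever the cause of the odd file names, the walk itself can be made tolerant so that one entry that cannot be stat'ed does not abort the whole listing. A minimal sketch in Python 3 (the question uses Python 2.6); the name list_large_files and the min_size parameter are assumptions, and skipping unreadable entries is a workaround rather than a fix for the names themselves.

```python
import os


def list_large_files(parent, min_size=500_000):
    """Return paths under `parent` whose size exceeds `min_size` bytes."""
    found = []
    for root, _dirs, files in os.walk(parent):
        for name in files:
            path = os.path.join(root, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # WindowsError is an OSError (an alias in Python 3, a subclass
                # in Python 2), so this also covers "[Error 123]"; skip the
                # entry instead of failing the whole walk.
                continue
            if size > min_size:
                found.append(path)
    return found
```

Joining the path once and reusing it also avoids the duplicated os.path.join call in the original loop.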


  • Django: Data corrupted after loading? (possible programmer error)

    - by Rosarch
    I may be loading data the wrong way. Excerpt of data.json:
        {
            "pk": "1",
            "model": "myapp.Course",
            "fields": {
                "name": "Introduction to Web Design",
                "requiredFor": [9],
                "offeringSchool": 1,
                "pre_reqs": [],
                "offeredIn": [1, 5, 9]
            }
        },
    I run python manage.py loaddata -v2 data:
        Installed 36 object(s) from 1 fixture(s)
    Then I check the above object using the Django shell:
        >>> info = Course.objects.filter(id=1)
        >>> info.get().pre_reqs.all()
        [<Course: Intermediate Web Programming>]  # WRONG! There should be no pre-reqs
        >>> from django.core import serializers
        >>> serializers.serialize("json", info)
        '[{"pk": 1, "model": "Apollo.course", "fields": {"pre_reqs": [11], "offeredIn": [1, 5, 9], "offeringSchool": 1, "name": "Introduction to Web Design", "requiredFor": [9]}}]'
    The serialized output of the model is not the same as the input that was given to loaddata. The output has a non-empty pre_reqs list, whereas the input's pre_reqs field is empty. What am I doing wrong?


  • Listing objects from ManyToManyField

    - by Noam Smadja
    I am trying to print a list of all the conferences and, for each conference, print its 3 speakers. In my template I have:
        {% if conferences %}
        <ul>
            {% for conference in conferences %}
            <li>{{ conference.date }}</li>
            {% for speakers in conference.speakers %}
            <li>{{ conference.speakers }}</li>
            {% endfor %}
            {% endfor %}
        </ul>
        {% else %}
        <p>No Conferences</p>
        {% endif %}
    In my views.py file I have:
        from django.shortcuts import render_to_response
        from youthconf.conference.models import Conference

        def manageconf(request):
            conferences = Conference.objects.all().order_by('-date')[:5]
            return render_to_response('conference/manageconf.html', {'conferences': conferences})
    The conference app has a model class named Conference with a ManyToManyField named speakers. I get the error "Caught an exception while rendering: 'ManyRelatedManager' object is not iterable" on this line: {% for speakers in conference.speakers %}


  • Check something before django server starts

    - by Vijay Shankar Kalyanaraman
    I am running my API behind a Django server, and I have a one-time token that the Django application needs and uses throughout its existence, until the process quits. To check whether I can proceed and serve requests (using the Django server), I need to validate this token against a database entry. Now, I could have a script that hits the DB and then issues the runserver command if the token is valid. But if the DB used by the Django application changes, I will have to change the script to point to the same DB as well. Is there a way I can pass this token into the runserver command as an additional parameter (along with hostname:port) and validate it before Django serves any requests? How can I access a parameter that is passed to ./manage.py runserver? Thanks.


  • Django - raw_id_fields title not refreshing.

    - by James Howell
    Hi, I am currently having an issue when using raw_id_fields within admin.py in my Django project. My site's admin area has a number of image upload fields on various model pages, all of which are ForeignKey fields to an Image model where all images for the site are stored. As the site will eventually deal with a large quantity of images (hundreds, maybe thousands), the default select box would be unusable. I created various admin.ModelAdmin classes, e.g.
        class InfoSlideAdmin(admin.ModelAdmin):
            raw_id_fields=('image',)
    These change the image selector on my edit pages from a select box to a raw ID field. However, when I select a different image using this control, the ID of the new image is shown but the title of the previous image is still displayed. Any ideas?


  • Make Codeigniter ignore directory

    - by Noah Goodrich
    I have Codeigniter installed and working for my main site. But I am now trying to add an add-on domain to the same hosting account, so I can have two sites running on the same hosting. Add-on domains create a new folder in the main public_html folder to store the web files. How can I get Codeigniter to ignore this directory? The site doesn't load properly when I try to view it. I also have SSL on the main site and redirection for www URLs. Here's my .htaccess file:
        RewriteEngine on
        Options +FollowSymLinks
        RewriteBase /
        RewriteCond %{HTTP_HOST} ^www\.mysite\.co.uk$ [NC]
        RewriteRule ^(.*)$ http://mysite.co.uk/$1 [L,R=301]
        RewriteCond %{REQUEST_FILENAME} !-f
        RewriteCond %{REQUEST_FILENAME} !-d
        RewriteRule ^(.*)$ /index.php/$1
        RewriteCond %{HTTPS} off
        RewriteCond %{REQUEST_URI} (site|sections|here)
        RewriteRule ^(.*)$ https://%{SERVER_NAME}%{REQUEST_URI} [R=301,L]
        RewriteCond %{HTTPS} onsite|sections|here)
        RewriteRule ^(.*)$ http://%{SERVER_NAME}%{REQUEST_URI} [R=301,L]


  • How do I protect python code?

    - by Jordfräs
    I am developing a piece of software in Python that will be distributed to my employer's customers. My employer wants to limit the usage of the software with a time-restricted license file. If we distribute the .py files, or even .pyc files, it will be easy to decompile them and remove the code that checks the license file. Another aspect is that my employer does not want the code to be read by our customers, fearing that the code, or at least the "novel ideas", may be stolen. Is there a good way to handle this problem, preferably with an off-the-shelf solution? The software will run on Linux systems (so I don't think py2exe will do the trick).


  • server reboot has caused django project to lose directories

    - by wmfox3
    A fully functional Django project, as well as a couple in development, have all broken following a reboot of the server. In addition to some pieces of the Django admin returning errors and .js and .css files going missing, I'm getting errors like this when viewing pages that include images uploaded through the admin:
        Exception Type: TemplateSyntaxError
        Exception Value: Caught an exception while rendering: (2, 'No such file or directory')
        Exception Location: /usr/lib/pymodules/python2.6/django/template/debug.py in render_node, line 81
        Python Executable: /usr/bin/python
        Python Version: 2.6.4
    So did the reboot stomp on some part of my configuration/setup, or did it fail to restart a critical piece?


  • Replacing backslashes in Python strings

    - by user323659
    I have some code that encrypts strings in Python. The encrypted text is used as a parameter in some URLs, but after encryption the string contains backslashes, and I cannot use a single backslash in urllib2.urlopen. I also cannot replace the single backslashes with double ones. For example:
        print cipherText
        '\t3-@\xab7+\xc7\x93H\xdc\xd1\x13G\xe1\xfb'
        print cipherText.replace('\\','\\\\')
        '\t3-@\xab7+\xc7\x93H\xdc\xd1\x13G\xe1\xfb'
    Putting r in front of \ in the replace statement did not work either. All I want to do is call a URL like this: http://awebsite.me/main?param="\t3-@\xab7+\xc7\x93H\xdc\xd1\x13G\xe1\xfb" whereas this URL can be called successfully: http://awebsite.me/main?param="\\t3-@\\xab7+\\xc7\\x93H\\xdc\\xd1\\x13G\\xe1\\xfb" Any ideas would be appreciated.
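A likely reading of the output above is that the backslashes belong only to the printed representation of non-printable bytes (a tab, 0xAB, and so on), so there is nothing for replace to find. Raw bytes are normally placed in a URL by percent-encoding them. A minimal Python 3 sketch, under the assumption that the receiving server percent-decodes the parameter (the question itself uses Python 2 and urllib2):

```python
from urllib.parse import quote, unquote_to_bytes

# The ciphertext from the question, written as a bytes literal.
cipher_text = b"\t3-@\xab7+\xc7\x93H\xdc\xd1\x13G\xe1\xfb"

# There is no literal backslash in the data, only in its repr.
print(b"\\" in cipher_text)

# quote() accepts bytes; safe="" percent-encodes everything except
# letters, digits and the characters "_.-~".
encoded = quote(cipher_text, safe="")
url = "http://awebsite.me/main?param=" + encoded
print(url)

# The receiving side recovers the original bytes by percent-decoding.
decoded = unquote_to_bytes(encoded)
```

A common alternative for binary values is to base64-encode the ciphertext with the URL-safe alphabet before putting it in the query string.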


  • opening iWorks documents in iPad UIWebView

    - by user369156
    Hello, I'm writing an iPad application that has a UIWebView in which I open Word and Excel documents, but I want the user to be able to import those documents into the iWorks applications, Pages and Numbers, just as you can in Safari when you open a document. If you open a document in Safari on the iPad, there is a button on the top bar that says "Open in..." and you can choose an application to open it in. You get the top bar to appear by tapping on the middle of the page. So is there an option you can set to make UIWebView show the bar, automatically detect the content type, and populate the list with applications you can import into? Or do I have to build this myself? And if I have to build my own, how do I open URLs to import documents into Pages, Numbers, etc.? Thanks, -David


  • Way around ASP.NET session being shared across multiple tab windows

    - by ace
    I'm storing a value in the ASP.NET session on the first page, and on the next page this session value is read. However, if multiple tabs are open and several page 1 to page 2 navigations are going on, the value stored in the session gets mixed up, since the session is shared between the browser tabs. I'm wondering what the options around this are:
        Query string: passing the value between the pages using the query string. I don't want to take this approach, since there can be multiple anchor tags on page 1 linking to page 2 and I cannot rewrite the URL of each tag, since they are dynamic.
        Cookies: in-memory cookies are shared across browser tabs too, same as the session cookie, right?
    Any other options?


  • User controlled html title tags

    - by zaf
    What are the best practices for allowing a user to maintain the HTML title tags of all the major pages of his/her site? One way could be to allow the mapping of URLs to some text. For example, we have an app with the following (most complex) URL format: http://lang.example.com/searchpage.zaf?a=foo&b=bar&c=RANDOM There are several parts to this:
        Language subdomain
        Search page
        Static parameter 'a' (user may want this in the title)
        Dynamic and relevant parameter 'b' (user may want this in the title)
        Dynamic parameter 'c', which can be ignored
    I've never done this before, so I'm asking how you would tackle this!
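The "mapping of URLs to some text" idea can be sketched as a table of per-page title templates that are filled in from the parts of the URL. This Python illustration is one possible shape, not an established practice; TITLE_TEMPLATES, page_title and the placeholder names are assumptions. string.Template is used because the templates are user-supplied and it only substitutes plain names.

```python
from string import Template
from urllib.parse import parse_qs, urlsplit

# User-maintained mapping from page path to a title template (hypothetical).
TITLE_TEMPLATES = {
    "/searchpage.zaf": Template("$a and $b ($lang)"),
}


def page_title(url, default="Untitled"):
    """Build a title from the URL's subdomain and query parameters."""
    parts = urlsplit(url)
    template = TITLE_TEMPLATES.get(parts.path)
    if template is None:
        return default
    # Keep the first value of each query parameter; unused ones (like c)
    # are simply ignored by the template.
    values = {key: vals[0] for key, vals in parse_qs(parts.query).items()}
    # The language is taken from the first label of the host name.
    values["lang"] = (parts.hostname or "").split(".")[0]
    # safe_substitute leaves unknown placeholders in place instead of raising.
    return template.safe_substitute(values)


print(page_title("http://lang.example.com/searchpage.zaf?a=foo&b=bar&c=RANDOM"))
```

Values taken from the URL are user input, so they would still need HTML escaping before being written into a title tag.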


  • ctypes import not working on python 2.5

    - by user551906
    Hi, I am trying to import ctypes, and I am using Python 2.5.5 installed using MacPorts (on Mac OS X 10.6). I get an error saying "ImportError: No module named _ctypes" (see details below). As I understand it, ctypes is supposed to come preinstalled with Python 2.5. Any suggestions? Thanks, Saurabh
    Error details:
        $ python
        Python 2.5.5 (r255:77872, Nov 30 2010, 00:05:47)
        [GCC 4.2.1 (Apple Inc. build 5659)] on darwin
        Type "help", "copyright", "credits" or "license" for more information.
        >>> import ctypes
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
          File "/opt/local/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/ctypes/__init__.py", line 10, in <module>
            from _ctypes import Union, Structure, Array
        ImportError: No module named _ctypes

