Offline Browser Features

Download websites for data mining
Data mining and data extraction are two essential processes used by researchers, journalists, business executives, and marketing analysts to gather valuable insights from large sets of data. Data mining involves the use of sophisticated algorithms to identify patterns and trends within a dataset. Data extraction, on the other hand, involves collecting data from different sources such as websites, databases, and APIs, to name a few.
WebSite eXtractor is an offline browser software that enables users to download whole websites or parts of them to their computer. This tool is beneficial for researchers, journalists, students, equity analysts, and marketing executives who want to extract valuable data and digital images from websites. With WebSite eXtractor, users can browse the web at their convenience without worrying about slow page loading.
The process of data extraction with WebSite eXtractor is straightforward. Users can input the URL of the website they want to download, and the software will automatically download the entire site or specific parts of it, such as images or documents. Once the data is downloaded, users can extract the data they need and analyze it using various data analysis tools, such as DB Maker.
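To make the idea concrete, here is a minimal Python sketch of the kind of link extraction such a tool performs on a downloaded page. This is not WebSite eXtractor's actual code; `extract_links` is a hypothetical helper built on the standard library:

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects the href attribute of every <a> tag it sees."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def extract_links(html: str) -> list[str]:
    """Return all hyperlink targets found in an HTML document."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links

page = '<p>See <a href="docs.html">the docs</a> and <a href="img/logo.png">logo</a>.</p>'
print(extract_links(page))  # -> ['docs.html', 'img/logo.png']
```

A real offline browser repeats this step for every downloaded page, queueing the discovered links for further download.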
DB Maker is a powerful data mining tool that can be used to extract valuable insights from datasets.

With DB Maker, users can analyze data, identify patterns, and generate reports based on the information collected from websites using WebSite eXtractor. This tool is particularly useful for business executives and marketing analysts who want to gain a competitive advantage by analyzing data and trends in their industry.

WebSite eXtractor is an excellent tool for anyone who wants to extract valuable data from websites quickly and easily. With its user-friendly interface and advanced features, users can browse the web at their convenience, extract the data they need, and analyze it using various data analysis tools.

With Website eXtractor, you can:
Limit your search by domain types (.com, .net, .uk, etc.) using the sophisticated filtering options based on a list of keywords and other criteria.
Scan websites both online and offline (on your own hard drive) using the built-in browser.
Change html-links to relative names, allowing you to easily move information to another hard drive.
Select the documents you download by type and name using the superb filtering features.
Set download depth for websites (you can choose to download only the first few pages of a site and weed out the material you don't need).
Download a large number of websites to your PC without having to click your mouse a hundred times or more to save files to a folder or directory. (You can go and get a coffee, and when you return all the files you need will be neatly downloaded into the folder of your choice - ready to view offline).
Conveniently browse graphic-heavy websites that take an eternity to load (no more frustration).
Great for viewing websites with photo albums or galleries. Even high-speed connections take a long time to download graphics. And, the truth is, viewing such sites is far easier when done offline.
Create databases from downloaded websites using the DB Maker program.
Make an exact copy of your own business or personal website and transfer it to the server of another provider - a very handy feature for webmasters.
Offline Browser General settings
Before running the program it is advisable to adjust the general settings.
To do this, launch the program and choose Default Options.

The first thing to do is decide which directory will store your project files and which directory will store the files copied (downloaded) from the Internet.

Let's take a look at the various options available.

Download and overwrite all files - this option downloads files onto your hard drive, overwriting any existing copies.
Follow new links / URL - this option allows you to automatically extract other websites linked to the one you are scanning.
Stay within initial domain list
A very convenient option that restricts downloading to the domains in your original list of addresses. Hyperlinks pointing to websites outside that list are extracted (recorded) but not downloaded. Use this option when you only want the files you ordered and do not need the other websites referred to by the one you are downloading. For example, if you only need to download the list of (URL) addresses
www.internet-soft.com
www.softwarea.com

this option lets you skip the other domains linked from this original list of domains.
Extract local link - to search for local hyperlinks. This option allows you to search for local links on the website you are scanning, i.e. links that refer to other documents on the website.
Extract only external link - to search for external hyperlinks.
This option allows you to follow only links that refer to other web domains and websites.
Links level limit - the number of download levels, i.e. how many hyperlink steps the program will follow.

An example will help illustrate this option. Assume there is a hyperlink from one site to another, a link from that second site to a third, and so on. A chain of hyperlinks must be followed to get from the first site to the last. This option sets the maximum number of hyperlink steps the program will follow. If you select only one level, the program copies only the websites (call them level-1 websites) linked directly from the site you are downloading (scanning), and not the sites linked from those level-1 websites.
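The level limit amounts to a depth-limited breadth-first walk over the link graph. Here is a hedged Python sketch of the idea (not the program's actual implementation), using a hypothetical dictionary to stand in for the links found on each page:

```python
from collections import deque

def crawl(start, links, level_limit):
    """Breadth-first walk of a link graph, stopping level_limit hops
    from the start page.  `links` maps each URL to the URLs it links to."""
    seen = {start}
    queue = deque([(start, 0)])
    order = []
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if depth == level_limit:
            continue  # do not follow links beyond the chosen level
        for nxt in links.get(url, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return order

site = {
    "A": ["B"],  # A links to B ...
    "B": ["C"],  # ... which links to C, and so on
    "C": ["D"],
}
print(crawl("A", site, 1))  # -> ['A', 'B']  (C and D are beyond level 1)
```

With a level limit of 1 only the start page and its direct links are copied, exactly as described above.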

    The following chart shows how the links level limit works
    Number of connections
In this item you enter the number of simultaneous connections. As a rule, 5 - 10 connections are used. The optimal number of connections will depend on your bandwidth and the connection speed of your provider.
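Parallel connections of this kind can be sketched with a worker pool. This is an illustrative Python example, not the program's own code; `fetch` is a stand-in for a real HTTP request:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    # Stand-in for a real HTTP request; a real downloader would use
    # urllib.request or similar here.
    return f"<html>{url}</html>"

def download_all(urls, connections=5):
    """Download a batch of URLs using a fixed number of parallel connections."""
    with ThreadPoolExecutor(max_workers=connections) as pool:
        return dict(zip(urls, pool.map(fetch, urls)))

pages = download_all(["a.html", "b.html", "c.html"], connections=2)
print(sorted(pages))  # -> ['a.html', 'b.html', 'c.html']
```

The `max_workers` value plays the role of the "number of connections" setting: it caps how many documents are in flight at once.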
    Save results automatically
To save your results automatically every N minutes. This option sets how frequently your interim search results are saved.
    Time out for one connection
    This option gives the maximum amount of time in seconds during which each document (one connection) is downloaded.
    At the end of this time the program starts downloading the next document.
    Number of retries
The number of attempts made to download each document. If the connection to the provider or the website link is broken off, the program will retry the download as many times as you specify.
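The timeout-and-retry behaviour can be sketched as follows. This is an illustrative Python example, not the tool's implementation; the `flaky` function simulates a connection that breaks off twice before succeeding:

```python
def with_retries(download, retries=3):
    """Run `download` up to `retries` times, returning the first result
    that does not raise; re-raise the last error if every attempt fails."""
    last_error = None
    for _ in range(retries):
        try:
            return download()
        except OSError as err:
            last_error = err  # connection broken off: try again
    raise last_error

# A real call would wrap something like
#   urllib.request.urlopen(url, timeout=60).read()
# where `timeout` corresponds to the "time out for one connection" setting.
attempts = []
def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("connection broken off")
    return "document body"

print(with_retries(flaky, retries=3))  # -> document body
```

After the specified number of failed attempts the helper gives up and the program would move on to the next document.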
    Copy subdirectory structure from website
- to copy the subdirectory structure of the website you wish to download. If this option is checked, the program creates directories on your hard drive that mirror the ones on the website you are downloading.
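Mirroring a site's directory structure amounts to mapping each URL path onto a local path. A hedged Python sketch of the idea, where the `downloads` root folder and the `local_path` helper are hypothetical names chosen for illustration:

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

def local_path(url, root="downloads"):
    """Mirror a URL's host and directory structure under a local root folder."""
    parts = urlparse(url)
    path = parts.path.lstrip("/") or "index.html"  # bare host -> index page
    return str(PurePosixPath(root) / parts.netloc / path)

print(local_path("https://www.internet-soft.com/img/logo.gif"))
# -> downloads/www.internet-soft.com/img/logo.gif
```

Each downloaded file lands in a folder tree that matches the one on the server, so relative links keep working offline.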
    Apply domainname.com=www.domainname.com
On some sites the hyperlinks to other pages omit the www prefix, so the same documents may be downloaded twice into different directories. This option is designed to deal with that anomaly. If you highlight this option, INTERNET-SOFT.COM and WWW.INTERNET-SOFT.COM will be treated as synonymous addresses; the address is automatically prefixed with www during this type of search.
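A sketch of this normalization in Python, as an illustration of the idea rather than the program's actual rule set:

```python
from urllib.parse import urlparse, urlunparse

def normalize_host(url):
    """Treat domain.com and www.domain.com as the same address by
    lower-casing the host and prefixing bare domains with www."""
    parts = urlparse(url)
    host = parts.netloc.lower()
    if host.count(".") == 1:  # e.g. internet-soft.com, no subdomain
        host = "www." + host
    return urlunparse(parts._replace(netloc=host))

print(normalize_host("http://INTERNET-SOFT.COM/index.html"))
# -> http://www.internet-soft.com/index.html
```

With every link normalized this way before downloading, the two spellings of a domain map to one directory instead of two.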
Expand the node's parents to make the node visible
This convenience option graphically represents the tree of websites being scanned. It expands the current branches of the site being downloaded so the program can show where each site is being saved.
    Identify browser as
    This option shows how the program will be identified when the website is downloaded by a remote server.
For example, when you download a page using Internet Explorer 5.0, the remote server records the browser's identification in its log. The eXtractor program identifies itself in the same way when it visits a website.
Proxy Server
Enter the proxy server properties (if you use a proxy server).
    Then choose any other options you would like to use in downloading and searching for hyperlinks.
    We would like to draw your attention to the following:
Since the worldwide web contains a huge number of pages, downloading links and websites may require considerable processing power and a large amount of disk space on your computer. A few hours of running the program may take up many gigabytes on your hard disk.
    File Type Filter: Limiting the types and sizes of files
    You can use this option to specify the types of files you want to download and limit their size.
    This is important, for example, when you only want to download text documents without banners, pictures or archive files.
In this case, check the option beside html, htm, txt, shtml, etc. You can also use these menu options to limit the size of files to be downloaded. If you have selected "Load all file sizes", files of all sizes will be downloaded. Otherwise, only files within the size range (specified in bytes) you have selected will be downloaded.
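The effect of this filter can be sketched in Python; `should_download` is a hypothetical helper, and the extensions and size cap are example values:

```python
def should_download(filename, size, allowed_exts, max_size=None):
    """Keep a file only if its extension is allowed and, when a size cap
    is set, it is not larger than max_size bytes."""
    ext = filename.rsplit(".", 1)[-1].lower()
    if ext not in allowed_exts:
        return False
    if max_size is not None and size > max_size:
        return False
    return True

text_only = {"html", "htm", "txt", "shtml"}
print(should_download("index.html", 4_096, text_only))                # -> True
print(should_download("banner.gif", 4_096, text_only))                # -> False
print(should_download("huge.html", 9_000_000, text_only, 1_000_000))  # -> False
```

With a filter like this, banners, pictures, and archive files are skipped and only the text documents you asked for reach your disk.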
    URL / Domain Filter: Limitations by names of directories, domain names and files.
    You can make limitations by entering certain words in domains. Let's say you're downloading files only from https://www.internet-soft.com.
    You would only enter internet-soft as the filter word.
The filter can be applied separately:
• to words in the domain name;
• to the domain extension;
• to words in a directory name;
• to words in the file name.
The filter can be used to include or exclude. If you enter words into the exclude filter, any URL containing one of those words will not be downloaded. If you use the include filter, only URLs containing at least one of the specified words will be downloaded.
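A Python sketch of how such include/exclude word filters behave (`passes_filter` is a hypothetical name chosen for illustration):

```python
def passes_filter(url, include=None, exclude=None):
    """Apply word filters to a URL: reject it if it contains any exclude
    word; if an include list is given, require at least one include word."""
    if exclude and any(word in url for word in exclude):
        return False
    if include and not any(word in url for word in include):
        return False
    return True

print(passes_filter("https://www.internet-soft.com/extractor.htm",
                    include=["internet-soft"]))  # -> True
print(passes_filter("https://www.example.com/ads/banner.htm",
                    exclude=["ads"]))            # -> False
```

Note that the exclude filter wins when both are set: a URL matching an exclude word is rejected even if it also matches an include word.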
    Domains: Limitations by domain type.
    This option enables you to make limitations by type and country of the domain.
    To do this click on the requested domain type. This is all you have to do for the main program settings.
When you exit the options window, the data you have entered is saved by default, and you can proceed to download websites.
Now we can start a project. The default properties you have entered will automatically be used when you start a new project. For each separate project these properties can be altered and saved for later use.
The term "project" therefore refers to the complete set of options defining which site is to be downloaded and with which properties.

    Downloading a website

To download the website you need to your hard drive, first create a downloading project.
    • Select Project on the main menu and then New. A window will appear for viewing websites and entering download properties;
    • Now enter the address of the site you would like to download;
    • Press Download / Extract.
The lower panel shows the pages of the website as they are being downloaded; the files are then saved to your computer. Let's now take a more detailed look at the main control panels used in a downloading project.
    Site Map – Structure of the Site
On the left-hand panel of the program you can see a map of the site to be downloaded, built from the links the program has found. You can also use the local menu, which is called up by right-clicking your mouse. The structure of the site is created automatically only if you choose the option "Follow new links".
You can also delete unnecessary links from the site structure or copy them to the clipboard. Other properties can be set by selecting Options or Project Options on the menu list. For example, using the latter you can make customized filter settings and download level options.
Online / Offline Preview
On the right-hand side you can see the window for viewing web pages. This window can run either
      • Online
      • Offline
This browser window allows you to perform the same operations as an ordinary browser: surfing and following links. Use the right mouse button to call up the local menu. You can move the browser window or remove it through the main menu of the program. You can view pages both offline and online while you download other documents.
You can also copy a link to the clipboard and then paste it:
      • Into the list of sites to be downloaded;
      • Into the site map.
      Other setting options
      The program also offers a range of other frequently-used options.
If you have a text file with a list of websites, you can load this file into the site window using the Load button.


      Get Offline Browser

WebSite eXtractor is one of the fastest website downloaders available today.
      Once you have downloaded websites using Website eXtractor, you can perform data mining and data extraction using other tools such as DB Maker.