Wednesday 3 August 2016

MGU-BOT


Hi all, it's been a long time since the last post. Sorry for the delay. This time I have a bot that does a lot of scraping. I used it to scrape data from our university website, where exam results are published. The site offered no way to get multiple results, or the results for a whole batch, at once, so I made a bot for that. The bot goes to the website, gets the result of one person and stores it in a database. Then it moves on to the next person, and the same process continues until the loop ends. I then analyzed the collected data and presented it in a readable form. You can see what I did with that data here. You can also download the source of my bot from GitHub.

[Note: This code can no longer be used to scrape data from the given URL, because the site has changed the structure of its HTML document. Use it as a reference only.]
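The fetch-and-store loop described in this post can be sketched roughly as follows. This is not the bot's actual source. It is a minimal illustration written for Python 3 with the standard library only. The names `collect_results`, `fetch_result` and `fake_fetch`, the table layout, and the choice of SQLite are all assumptions made for the example.

```python
import sqlite3


def collect_results(register_numbers, fetch_result, db_path=":memory:"):
    """Fetch one result per register number and store each one in SQLite.

    fetch_result is a hypothetical callable: it takes a register number and
    returns the result text (for example by downloading and parsing a page).
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "register_no TEXT PRIMARY KEY, result TEXT)"
    )
    for register_no in register_numbers:
        try:
            result = fetch_result(register_no)
        except Exception as exc:  # one failed lookup should not stop the batch
            print("skipping %s: %s" % (register_no, exc))
            continue
        conn.execute(
            "INSERT OR REPLACE INTO results VALUES (?, ?)",
            (str(register_no), result),
        )
    conn.commit()
    return conn


def fake_fetch(register_no):
    """Stand-in for the real scraper; returns made-up text, not real data."""
    return "result for %s" % register_no
```

Calling `collect_results([1001, 1002, 1003], fake_fetch)` fills an in-memory database with three rows. A real bot would pass a function that downloads and parses the result page instead of `fake_fetch`.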

Saturday 22 August 2015

Link Miner

Link miner bot

This bot extracts all the links from fetched web content and displays them on the screen. It prints not only the link text but also the URL each link points to. Here is the code:

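import mechanize

# Written for Python 2 (note the print statement below).
def linkminer():
    browser = mechanize.Browser(factory=mechanize.RobustFactory())
    browser.set_handle_robots(False)  # do not fetch or obey robots.txt
    browser.open("http://www.minerbots.blogspot.in/")  # you can give your own URL
    for link in browser.links():  # every link found in the fetched page
        print link.text, link.url  # link text, then the URL it points to
        print  # blank line after each link

linkminer()  # run the bot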
 

The output lists each link's text followed by the URL it points to, with a blank line after each link.
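To show what extracting a link's text and URL involves, here is a rough sketch of the same idea using only the Python 3 standard library. It is not how mechanize does it internally. The class `LinkCollector` and the function `extract_links` are names invented for this example, and it works on an HTML string you already have rather than fetching a page.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (text, url) pairs for every <a href="..."> in a document."""

    def __init__(self):
        super().__init__()
        self.links = []     # finished (text, url) pairs
        self._href = None   # href of the <a> that is currently open, if any
        self._text = []     # pieces of text seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:  # anchors without href are not links
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_links(html):
    """Return a list of (text, url) pairs found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links
```

For example, `extract_links('<a href="/about">About</a>')` gives one pair holding the text `About` and the URL `/about`.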





Pattern searching bot..

Pattern-counter

Pattern counter, or counter bot, is the first bot we are going to build. It is a simple bot that looks for a pattern or string in fetched web content. So let's start.
Code:

import mechanize

# Written for Python 2 (note the print statement below).
def patterncounter():
    count = 0  # number of lines that contain the pattern
    browser = mechanize.Browser(factory=mechanize.RobustFactory())  # initialise the browser
    browser.set_handle_robots(False)  # do not fetch or obey robots.txt
    browser.open("http://www.minerbots.blogspot.in/")  # open the URL
    html = browser.response().readlines()  # fetch the web content as a list of lines
    for line in html:  # search for the pattern 'Vicz' line by line
        if 'Vicz' in line:
            count += 1
    print "%d No of times found" % count  # produce the result

patterncounter()  # run the bot

The output is a single line reporting the count.
 

This counter bot counts the lines that contain the string 'Vicz' in the content fetched from the URL 'http://www.minerbots.blogspot.in/'. A line that contains the string more than once is still counted once.
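The bot tests each line with `in`, so it counts matching lines rather than individual matches. The small sketch below shows the difference on a list of lines you already have. It is written for Python 3, and the function names `count_matching_lines` and `count_occurrences` are made up for this example.

```python
def count_matching_lines(lines, pattern):
    """Count the lines that contain the pattern at least once."""
    return sum(1 for line in lines if pattern in line)


def count_occurrences(lines, pattern):
    """Count every non-overlapping occurrence of the pattern in all lines."""
    return sum(line.count(pattern) for line in lines)
```

If a page mentions the pattern twice on the same line, the first function counts that line once and the second counts it twice. Pick whichever matches the question you are asking.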

The demo is over. Now do it yourself ;)

Web mining


I am not going into the difficult definitions of web mining; I only want you to understand what it really is. In simple words, web mining is the art of extracting useful information from the web (the internet) by means of an automated system. Search engines are the biggest web miners we have seen.
There are four steps in mining data from the web:

1) Collecting: Fetch the web content of the given URL from the web.
2) Parsing: Extract the useful data from the fetched content.
3) Analyzing: Analyze the extracted data and turn it into useful information.
4) Producing: Produce the results and store them in a file.

These four steps are used in all the bots we are going to build.
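As a rough illustration, the four steps can be written as four small functions. This sketch is for Python 3 and uses only the standard library. The function names, the word-counting task, and the output file format are all choices made for the example, not part of any bot on this blog.

```python
import re
from collections import Counter
from urllib.request import urlopen


def collect(url):
    """Step 1, collecting: fetch the raw page content for a URL."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def parse(html):
    """Step 2, parsing: pull out the useful part, here just the words."""
    text = re.sub(r"<[^>]+>", " ", html)  # crude tag stripping, enough for a sketch
    return re.findall(r"[A-Za-z']+", text.lower())


def analyze(words, top=3):
    """Step 3, analyzing: turn the words into the most common ones."""
    return Counter(words).most_common(top)


def produce(ranking, path):
    """Step 4, producing: write one 'word count' pair per line to a file."""
    with open(path, "w") as out:
        for word, count in ranking:
            out.write("%s %d\n" % (word, count))
```

A full run chains them together, for example `produce(analyze(parse(collect(url))), "words.txt")`, where `url` is a page you are allowed to fetch and `words.txt` is a file name picked for the example.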

Friday 21 August 2015

About

What is Minerbots??

Minerbots is a project that gives you a step-by-step guide to creating your own customized miner bots. We use Python as the programming language. By working through this project you will come to understand what web mining is, and you will create your own bots :)

Why Minerbots??

As far as I know, no other project guides you through creating a web bot. Here you will find plenty of examples that mine data from web content.

Open Source

This project is open source, and you get full access to the source code of every bot, free of charge.