How do you find the list of FNO companies?

Amit Ghosh
I used to scrape this link twice a month https://www.nseindia.com/content/fo/fo_underlyinglist.htm but NSE has made some changes. So I made this:


# Scraping the site
import requests
session = requests.Session()
headers = {
'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3',
'accept-encoding': 'gzip, deflate, br',
'accept-language': 'en-US,en;q=0.9,hi;q=0.8',
'Cache-Control': 'max-age=0',
'Connection': 'keep-alive',
'user-agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36'
}
# Pass the headers defined above; the original call never used them
response = session.get('http://zerodha.com/margin-calculator/Futures/', headers=headers)
response = response.content
#print(response)

#Extracting the Table
response = str(response,"utf-8").split('
')[1]
response = response.split('
  • tahseen
    Why are you doing all this scraping? And if you are scraping, why do it without BeautifulSoup?

    Anyway, if you have a Zerodha Kite account, why don't you just use the instrument file that Zerodha provides every morning? It has it all.
  • tahseen
    By the way, if you still want to extract the FnO list from NSE, this is working:
    import requests
    from bs4 import BeautifulSoup

    headers = {'user-agent' : 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) \
    AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36'}

    res=requests.get('https://www.nseindia.com/content/fo/fo_underlyinglist.htm', headers=headers)
    soup = BeautifulSoup(res.content,'lxml')
    rows = soup.find_all('td', attrs = {'class':'t0'})

    # If you want indices also then replace rows[6:] with just rows
    fno = [row.find('a').text for row in rows[6:] if row.find('a').text in row.find('a').get('href')]
    Now fno holds the list of FNO stocks.
  • Amit Ghosh
    See, I already wrote:
    I used to scrape this link twice a month https://www.nseindia.com/content/fo/fo_underlyinglist.htm but NSE has made some changes
    The problem is that
    res=requests.get('https://www.nseindia.com/content/fo/fo_underlyinglist.htm', headers=headers)
    doesn't work all the time. NSE blocks requests abruptly.

    So my code is a failsafe.
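The failsafe idea here (try NSE first, fall back to another source when NSE blocks the request) can be sketched as a small helper. This is only an illustration: `first_working_source` and the stand-in fetchers are hypothetical names, not code from this thread, and real fetchers would wrap the `requests` calls posted above.

```python
def first_working_source(sources):
    """Try each (name, fetch) pair in order and return (name, symbols)
    from the first fetch that returns a non-empty list."""
    failures = []
    for name, fetch in sources:
        try:
            symbols = fetch()
        except Exception as exc:  # e.g. a blocked or timed-out request
            failures.append(f"{name}: {exc!r}")
            continue
        if symbols:
            return name, symbols
        failures.append(f"{name}: empty result")
    raise RuntimeError("all sources failed: " + "; ".join(failures))


# Stand-in fetchers (hypothetical) so the sketch runs without network access.
def blocked_nse():
    raise ConnectionError("request blocked")


def backup_source():
    return ["EXAMPLE1", "EXAMPLE2"]  # made-up symbols, not real tickers


source, fno = first_working_source([("nse", blocked_nse), ("backup", backup_source)])
print(source, fno)
```

Keeping the sources in an ordered list makes it easy to add a third option later without touching the retry logic.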
  • Amit Ghosh
    On BeautifulSoup: I find it messy and time-consuming, so I skip it wherever I possibly can. Most of the work gets done anyway, because pandas can pick tables out of a page into a DataFrame and string stripping handles the rest.

    Here is what I use for that:
    import requests
    import pandas as pd

    tables=pd.read_html(requests.get('https://www.nseindia.com/content/fo/fo_underlyinglist.htm').content)[3].iloc[:, 1].drop([4])
    tables will contain the FNO data.
  • tahseen
    @Amit,

    1. Yes, pd.read_html is pretty simple. Personally I would avoid BeautifulSoup unless absolutely necessary.
    2. I detailed the BeautifulSoup approach in the context of your first code.

    Also, blocking my NSE is not basis headers value

    I don't use BeautifulSoup in any of my trading code.
  • tahseen
    *Also, blocking by NSE is not based on the headers value
  • Amit Ghosh
    No, no. I meant that NSE sometimes blocks scraping requests to their server. (It happens randomly; I am not talking about the IP block they apply if you fetch multiple times within 3 minutes.)

    I tried tweaking the headers value in the requests module, but all in vain. NSEPY, NSETOOLS: nothing works at those times. Maybe it is a cookie; I will keep looking.
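To stay clear of the IP block mentioned above, one option is to enforce a minimum gap between fetches. A minimal sketch, assuming the 3-minute figure quoted in this post (it is not a documented NSE limit); `MinIntervalGate` is a hypothetical helper, and the clock is injectable only so the demo runs without real waiting.

```python
import time


class MinIntervalGate:
    """Tracks when the last request was made and reports how long to wait."""

    def __init__(self, min_interval_s, clock=time.monotonic):
        self.min_interval_s = min_interval_s
        self._clock = clock
        self._last = None  # time of the previous request, if any

    def wait_time(self):
        """Seconds still to wait before the next request is allowed."""
        if self._last is None:
            return 0.0
        elapsed = self._clock() - self._last
        return max(0.0, self.min_interval_s - elapsed)

    def mark(self):
        """Record that a request has just been made."""
        self._last = self._clock()


# Demo with a fake clock, so nothing actually sleeps.
fake_now = [0.0]
gate = MinIntervalGate(180, clock=lambda: fake_now[0])
gate.mark()          # first fetch at t = 0
fake_now[0] = 60.0   # one minute later
print(gate.wait_time())
```

In real use you would call `time.sleep(gate.wait_time())` before each fetch and `gate.mark()` right after it.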
  • adjas
    adjas edited December 2019
  • adjas
    @Amit Ghosh Here's the code in case you want to use it. You'd need to download chromedriver and pass its path to the code, and also set your download directory.
    from time import sleep
    from selenium import webdriver
    import os

    URL_SOS_SCHEME = 'https://www.nseindia.com/content/fo/sos_scheme.xls'
    URL_FO_MARKET_LOTS = 'https://www.nseindia.com/content/fo/fo_mktlots.csv'

    # Placeholders: set these to your own download directory and chromedriver path
    DOWNLOAD_DIR = r"C:\<path to download>"
    PATH_CHROME_DRIVER = r"<path to chromedriver>"

    def download_fno_metadata():
        options = webdriver.ChromeOptions()
        preference = {"profile.default_content_settings.popups": 0,
                      "download.default_directory": DOWNLOAD_DIR,
                      "directory_upgrade": True}
        options.add_experimental_option("prefs", preference)
        # Selenium 3 style call; Selenium 4 passes the driver path through a Service object instead
        browser = webdriver.Chrome(executable_path=PATH_CHROME_DRIVER, options=options)

        # Remove stale copies one by one so a missing file does not skip the other
        for name in ('sos_scheme.xls', 'fo_mktlots.csv'):
            try:
                os.remove(os.path.join(DOWNLOAD_DIR, name))
            except OSError:
                pass

        # Opening each URL triggers the download; the sleeps give it time to finish
        browser.get(URL_SOS_SCHEME)
        sleep(10)
        browser.get(URL_FO_MARKET_LOTS)
        sleep(10)
        browser.close()
  • tahseen
    tahseen edited December 2019
    @adjas Selenium is unnecessary here.
  • adjas
    @tahseen Those two files have not just the names but also all the other details you'd need for FNO contracts (both options and futures). How you download and parse them is up to you.
  • tahseen
    @adjas
    requests and pandas will do the magic


    import requests
    from io import StringIO
    import pandas as pd

    URL_FO_MARKET_LOTS = 'https://www.nseindia.com/content/fo/fo_mktlots.csv'

    data = requests.get(URL_FO_MARKET_LOTS).content

    df = pd.read_csv(StringIO(data.decode('utf-8')))

    print(df.tail())
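Once the CSV is loaded, the symbol list can be pulled out of it. The sketch below assumes the file has a `SYMBOL` column, that headers and values are padded with spaces, and that a header-like row may repeat partway through the file; none of that is confirmed in this thread, and the sample text is made up so the example runs offline. `symbols_from_market_lots` is a hypothetical helper name.

```python
from io import StringIO

import pandas as pd


def symbols_from_market_lots(csv_text):
    """Return the cleaned SYMBOL column from market-lots CSV text."""
    df = pd.read_csv(StringIO(csv_text), dtype=str)
    df.columns = df.columns.str.strip()          # headers may be space-padded
    symbols = df["SYMBOL"].dropna().str.strip()  # values may be space-padded too
    # Drop blanks and any repeated header row (assumed layout)
    keep = (symbols != "") & (symbols.str.upper() != "SYMBOL")
    return symbols[keep].tolist()


# Made-up sample in the assumed layout; not real NSE data.
SAMPLE = (
    "UNDERLYING          ,SYMBOL    ,DEC-19    \n"
    "EXAMPLE INDEX       ,EXIDX     ,75        \n"
    "Derivatives on Individual Securities,Symbol    ,DEC-19    \n"
    "EXAMPLE COMPANY LTD ,EXCO      ,1000      \n"
)

print(symbols_from_market_lots(SAMPLE))
```

With the real file, you would pass `data.decode('utf-8')` from the snippet above instead of `SAMPLE`.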

  • Amit Ghosh
    @adjas Thanks man. Didn't know about those two files earlier.
  • krtrader
    One simple way is to use the Kite instruments API.
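For the instrument-file route suggested in this thread, a minimal sketch of filtering the dump down to FNO underlyings. It assumes the dump is a CSV with `name` and `segment` columns, that futures rows carry the segment value `NFO-FUT`, and that `name` holds the underlying; check these against the actual file. `fno_underlyings` is a hypothetical helper, and the sample rows are made up.

```python
import csv
from io import StringIO


def fno_underlyings(instruments_csv, segment="NFO-FUT"):
    """Return sorted unique underlying names for rows in the given segment."""
    reader = csv.DictReader(StringIO(instruments_csv))
    names = {
        row["name"].strip()
        for row in reader
        if row["segment"] == segment and row["name"].strip()
    }
    return sorted(names)


# Made-up sample rows in the assumed column layout; not real instruments.
SAMPLE_DUMP = (
    "tradingsymbol,name,segment,exchange\n"
    "EXCO19DECFUT,EXCO,NFO-FUT,NFO\n"
    "EXCO20JANFUT,EXCO,NFO-FUT,NFO\n"
    "ABCD19DECFUT,ABCD,NFO-FUT,NFO\n"
    "EXCO19DEC100CE,EXCO,NFO-OPT,NFO\n"
    "WXYZ,WXYZ,NSE,NSE\n"
)

print(fno_underlyings(SAMPLE_DUMP))
```

Under these assumptions the result would also include index underlyings, so filter those out separately if you only want stocks.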