4. API Reference
4.1. Data Fetchers
Stock Data
Other Data
Fundamental Data
-
class
factorset.data.FundCrawler.
FundCrawler
(TYPE) The FundCrawler class; crawls fundamental data using coroutines.
-
fetch
(queue, session, url, ticker) Crawl the fundamental data of a single ticker.
Parameters: - queue – ticker queue
- session – aiohttp.ClientSession()
- url – URL from which the stock's fundamental data is crawled
- ticker – stock code
Returns: fundamental data as text
-
main
(Ashare, num=10, retry=2) Main routine for the coroutine-based crawl.
Parameters: - Ashare (list) – tickers to be crawled
- num (int) – maximum number of coroutines
- retry (int) – number of restarts
Returns: None
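The pattern that main describes (at most num coroutines working through a list of tickers, with retries) can be sketched with the standard library alone. This is an illustrative sketch, not FundCrawler's implementation: fake_fetch, worker, and crawl are hypothetical names, the network call is simulated instead of going through aiohttp, and retry is interpreted here as per-ticker retries, which may differ from the library's restart behaviour.

```python
import asyncio

async def fake_fetch(ticker, attempts):
    """Stand-in for an HTTP request; tickers starting with 'x' fail once."""
    attempts[ticker] = attempts.get(ticker, 0) + 1
    await asyncio.sleep(0)  # yield control, as a real network call would
    if ticker.startswith("x") and attempts[ticker] == 1:
        raise ConnectionError(f"temporary failure for {ticker}")
    return f"fundamental text for {ticker}"

async def worker(queue, results, attempts, retry):
    """Pull tickers off the queue until it is empty, retrying failed fetches."""
    while True:
        try:
            ticker = queue.get_nowait()
        except asyncio.QueueEmpty:
            return
        for _ in range(retry + 1):  # first attempt plus `retry` retries
            try:
                results[ticker] = await fake_fetch(ticker, attempts)
                break
            except ConnectionError:
                continue

async def crawl(tickers, num=10, retry=2):
    """Run at most `num` workers concurrently over the ticker queue."""
    queue = asyncio.Queue()
    for ticker in tickers:
        queue.put_nowait(ticker)
    results, attempts = {}, {}
    workers = [worker(queue, results, attempts, retry)
               for _ in range(min(num, len(tickers)))]
    await asyncio.gather(*workers)
    return results

results = asyncio.run(crawl(["000001", "x00002", "600000"], num=2, retry=2))
```

Sharing one queue between a fixed number of workers caps concurrency at num without needing a semaphore. A ticker that still fails after all retries is simply absent from results.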
-
4.2. Data Reader
The following methods are available for use in the prepare_data
(recommended) and generate_factor
API functions.
Stock Data
-
factorset.data.CSVParser.
all_stock_symbol
(dir) Parameters: dir (string) – data path
Returns: all stock tickers found under the path
-
factorset.data.CSVParser.
read_stock
(dir, ticker) Parameters: - dir (string) – data path
- ticker – ticker of a single stock
Returns: price data of a single stock, pd.DataFrame
-
factorset.data.CSVParser.
concat_stock
(dir, tickers) Vertically concatenate the price data of the specified stocks in a directory.
Parameters: - dir (string) – data path
- tickers – list of stock tickers
Return type: pd.DataFrame
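To illustrate what listing tickers and vertical concatenation mean here, the sketch below uses only the standard library. It is not the CSVParser implementation: it assumes one file per ticker named <ticker>.csv, uses plain lists of dicts in place of pd.DataFrame, and the helper names (list_tickers, read_rows, concat_rows) and sample prices are made up for the example.

```python
import csv
import tempfile
from pathlib import Path

def list_tickers(data_dir):
    """Return sorted ticker names inferred from '<ticker>.csv' file names."""
    return sorted(p.stem for p in Path(data_dir).glob("*.csv"))

def read_rows(data_dir, ticker):
    """Read one ticker's CSV into a list of dicts, tagging rows with the ticker."""
    with open(Path(data_dir) / f"{ticker}.csv", newline="") as f:
        return [dict(row, ticker=ticker) for row in csv.DictReader(f)]

def concat_rows(data_dir, tickers):
    """Stack the rows of several tickers one after another (vertical concat)."""
    rows = []
    for ticker in tickers:
        rows.extend(read_rows(data_dir, ticker))
    return rows

# Build a throwaway directory with two tiny, made-up price files.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "000001.csv").write_text("date,close\n20180102,13.70\n20180103,13.33\n")
    Path(tmp, "600000.csv").write_text("date,close\n20180102,12.72\n")
    tickers = list_tickers(tmp)
    combined = concat_rows(tmp, tickers)
```

Tagging each row with its ticker keeps rows distinguishable once they are stacked into a single table.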
Other Data
Fundamental Data
-
factorset.data.CSVParser.
all_fund_symbol
(dir, type) Get all tickers for one type of financial statement in the storage path.
Parameters: - dir (string) – data path
- type – statement type: ‘BS’, ‘IS’, or ‘CF’
Returns: tickers
Return type: list
-
factorset.data.CSVParser.
read_fund
(dir, type, ticker) Read one type of financial statement for a single stock.
Parameters: - dir (string) – data path
- type – statement type: ‘BS’, ‘IS’, or ‘CF’
- ticker (str) – stock ticker
Return type: pd.DataFrame
4.3. Data Util
-
factorset.data.OtherData.
code_to_symbol
(code) Generate the symbol identifier for a stock code.
Parameters: code – numeric code
Returns: stock code, str
-
factorset.data.OtherData.
shift_date
(date_str, n) Parameters: - date_str – date, a string in ‘YYYYMMDD’ format
- n – time span, int
Returns: the adjusted trading day, date
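The general idea of shifting a ‘YYYYMMDD’ string and adjusting the result to a trading day can be sketched as follows. The unit of n and the adjustment rule are not documented above, so this sketch makes two loud assumptions: n counts calendar days, and "trading day" means any weekday, with weekend results rolled back to the preceding Friday. The real shift_date may use an exchange calendar and a different rule; shift_date_sketch is a hypothetical name.

```python
from datetime import datetime, timedelta

def shift_date_sketch(date_str, n):
    """Shift a 'YYYYMMDD' date by n calendar days, then roll back to a weekday.

    Assumption: weekends are the only non-trading days (no holiday calendar).
    """
    shifted = datetime.strptime(date_str, "%Y%m%d").date() + timedelta(days=n)
    # weekday(): Monday is 0 ... Sunday is 6; step back over Saturday/Sunday.
    while shifted.weekday() >= 5:
        shifted -= timedelta(days=1)
    return shifted

# 2018-01-05 is a Friday; adding one day lands on Saturday and rolls back.
rolled_back = shift_date_sketch("20180105", 1)
# Adding three days lands on Monday 2018-01-08, which needs no adjustment.
next_monday = shift_date_sketch("20180105", 3)
```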
-
factorset.Util.finance.
ttmContinues
(report_df, label) Compute Trailing Twelve Months (TTM) for multiple indicators.
- computation rules:
- ttm indicator is computed on announcement date.
- on given release_date, use the latest report_date and the previous report year for computation.
- if any report period is missing, use weighted method.
- if two reports (usually first-quarter and annual) are released together, only keep the latest
Parameters: - report_df (pandas.DataFrame) – must have ‘report_date’, ‘release_date’, and <label> columns
- label (str) – column name for intended indicator
Returns: columned by [‘datetime’, ‘report_date’, <label>+’_TTM’, …]
Return type: pandas.DataFrame
Todo
If announce_date exists, use announce_date instead of release_date; report_date as well.
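The arithmetic behind the rule "use the latest report_date and the previous report year" can be sketched in plain Python. This is not the ttmContinues implementation. It assumes that report values are year-to-date cumulative figures keyed by (year, quarter), and it reads the "weighted method" for missing periods as simple annualisation of the year-to-date figure. Both are assumptions, and ttm_from_cumulative is a hypothetical name.

```python
def ttm_from_cumulative(reports, year, quarter):
    """Trailing-twelve-month value from year-to-date cumulative reports.

    reports maps (year, quarter) -> cumulative value up to that quarter.
    """
    latest = reports[(year, quarter)]
    if quarter == 4:
        return latest  # an annual report already covers twelve months
    prev_annual = reports.get((year - 1, 4))
    prev_same = reports.get((year - 1, quarter))
    if prev_annual is not None and prev_same is not None:
        # Latest year-to-date plus the part of last year after the same period.
        return latest + prev_annual - prev_same
    # Assumed fallback when a period is missing: annualise the latest figure.
    return latest * 4 / quarter

# Made-up figures: 2016 full year 100, 2016 H1 40, 2017 H1 50.
complete = {(2016, 4): 100.0, (2016, 2): 40.0, (2017, 2): 50.0}
ttm_complete = ttm_from_cumulative(complete, 2017, 2)   # 50 + 100 - 40

# Only a third-quarter report is available, so the fallback is used.
sparse = {(2017, 3): 60.0}
ttm_sparse = ttm_from_cumulative(sparse, 2017, 3)       # 60 * 4 / 3
```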
-
factorset.Util.finance.
ttmDiscrete
(report_df, label_str, min_report_num=4) Parameters: - report_df (pandas.DataFrame) – must have ‘report_date’, ‘release_date’, and <label> columns
- label_str –
- min_report_num (int) –
Returns: columned by [‘datetime’, ‘report_date’, <label>+’_TTM’, …]
Return type: pd.DataFrame