## Install dependencies

Install dependencies like this:

> pip install -r requirements.txt

## Run the program

Run like this:

> Install Docker and pull 'xingyaoww/codeact-executor'
> bash ./src/code_execution/start_jupyter_server.sh 8082
> python ./run_main.py

## Updates

- 2024-04-06: Added a sidebar for selecting the API type and entering the local API URL, model name, temperature, and max_tokens.
- 2024-04-07: Added a sidebar for selecting the local LLM type, including model selection for the Ollama server.
- 2024-04-07: Added automatic model listing when the Ollama server is used.
- 2024-04-08: Added model selection when ChatGPT (OpenAI) is selected.
- 2024-04-09: Added an input for the number of Python programs generated per click.
- 2024-04-10: Added language support: C++, C, Fortran.
- 2024-04-13: Moved main.py, app.py, codeagent.py, and utils.py to the src folder.
- 2024-04-13: Added run_main.py, in which python_path is set.
- 2024-04-13: Compiled the program to an .exe using PyInstaller.
- 2024-04-14: Added an input for where to store generated programs.
- 2024-04-14: Added restart and stop buttons, mainly for users of the .exe build.
- 2024-04-14: Added a folder picker for choosing where generated program files are stored.
- 2024-04-14: Added a file analysis and query routine. Too slow.
- 2024-04-16: Added caching of file embeddings, because chat_input triggers a page refresh that reruns all the Python code (see the caching sketch after this changelog).
- 2024-04-18: When a ParseError occurs, regenerate the code.
- 2024-04-18: In the code writer, allow the user to supply data in CSV format and ask questions, so that code is written to analyze the data or build ML models.
- 2024-04-19: Release of Ver 6.
- 2024-04-20: Execute the generated code in a virtual environment rather than calling exec() directly. Purpose: 1. safety; 2. the hope that the PyInstaller .exe can install packages.
- 2024-04-20: The program must first be run with `python ./run_main.py` to generate a venv called MECHInfo.
- 2024-04-22: Does not work yet.
- 2024-04-22: In the end, Docker and a Jupyter kernel are used to execute the Python code. Everything works fine now. Release of Ver 7 was delayed; Release of Ver 8.
- 2024-04-30: Added a chat-only session. Release of Ver 9.
- 2024-05-09: Added streaming output for the code writer, using st.expander() and content = expander.write_stream(langchain_stream) (see the streaming sketch after this changelog).
- 2024-05-09: Added streaming output for DocAnalyzer, using StreamlitCallbackHandler(st.container(), collapse_completed_thoughts=False). Release of Ver 10.
- 2024-05-14: If data is uploaded to perform ML, use the MLMD vector DB for RAG.
- 2024-05-19: Use an embeddings filter or LLMChainFilter to compress the context retrieved from MLMD, due to the many st.xxxx calls (see the compression sketch after this changelog). Release of Ver 11.
- 2024-05-27: Multimodal support in the chat-only session: LLaVA and MiniCPM. Release of Ver 12.
- 2024-05-31: Added a LangChain summary chain to DocAnalyzer.
- 2024-06-01: Added RapidAPI support for image analysis. The basic RapidAPI plan allows only 10 calls per month, so it is basically useless.
- 2024-06-05: Added SchemaScholar search based on chat history in DocAnalyzer: search_scholar.py.
- 2024-06-05: Image analysis is shown only when a multimodal model is selected. Release of Ver 13.
- 2024-06-10: Revised the SchemaScholar search based on chat history in both chat and DocAnalyzer: search_scholar.py. It works now.
- 2024-06-11: Added a document refiner.
- 2024-06-12: Added DeepSeek as an interface.
- 2024-06-18: Added a new function that translates English papers to Chinese.
- 2024-06-26: Added arXiv search based on chat history in DocAnalyzer: search_scholar.py. Release of Ver 14.
- 2024-07-05: Added iFLYTEK Spark as an interface; its Lite domain is free forever.
- 2024-07-06: Added code writing and execution checking in the chat and DocAnalyzer parts.
- 2024-07-06: Changed toggles to buttons in summarization, code writing, and web search. Release of Ver 15.
- 2024-07-18: Added PDF preview in translation; changed PDF parsing to pdfminer.six. Release of Ver 16.
- 2024-07-28: Added MinerU as an alternative PDF parser. You must install MinerU and get it working in a shell first.
- 2024-08-21: Optimized the translation: if the PDF has already been parsed or translated, the result is printed directly without rerunning the whole process. Release of Ver 18.
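The 2024-04-16 entry caches file embeddings because every `st.chat_input` interaction reruns the whole script. Below is a minimal sketch of that pattern, not the repo's actual code; the embedding backend, chunking, and function names are illustrative assumptions.

```python
import streamlit as st
from langchain_community.embeddings import OllamaEmbeddings  # assumed backend; any embedder works

# Cached so the rerun triggered by st.chat_input does not re-embed the same file.
# The cache key is the file's bytes, so uploading a different file still re-embeds.
@st.cache_data(show_spinner="Embedding uploaded file...")
def embed_file(file_bytes: bytes, model_name: str = "nomic-embed-text") -> list[list[float]]:
    text = file_bytes.decode("utf-8", errors="ignore")
    chunks = [text[i:i + 1000] for i in range(0, len(text), 1000)]  # naive fixed-size chunking
    return OllamaEmbeddings(model=model_name).embed_documents(chunks)

uploaded = st.file_uploader("Upload a file to analyze")
if uploaded is not None:
    vectors = embed_file(uploaded.getvalue())
    st.write(f"Embedded {len(vectors)} chunks")

query = st.chat_input("Ask about the file")  # each message reruns the script; embed_file is served from cache
```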
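The two 2024-05-09 entries describe the streaming UI. The sketch below shows both patterns, assuming a LangChain chat model served by Ollama; the model name and prompts are placeholders, and in the app the callback handler is attached to the DocAnalyzer chain rather than a bare model call.

```python
import streamlit as st
from langchain_community.callbacks.streamlit import StreamlitCallbackHandler
from langchain_community.chat_models import ChatOllama  # placeholder; any LangChain chat model works

llm = ChatOllama(model="llama3")

# Code writer pattern: stream tokens into an expander.
# write_stream() renders the chunks incrementally and returns the full text at the end.
expander = st.expander("Generated code", expanded=True)
langchain_stream = (chunk.content for chunk in llm.stream("Write a Python hello-world script."))
content = expander.write_stream(langchain_stream)

# DocAnalyzer pattern: render intermediate steps through a Streamlit callback handler.
st_callback = StreamlitCallbackHandler(st.container(), collapse_completed_thoughts=False)
response = llm.invoke("Summarize the uploaded document.", config={"callbacks": [st_callback]})
st.write(response.content)
```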
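The 2024-05-19 entry compresses the retrieved context with an embeddings filter or LLMChainFilter. Here is a minimal sketch of both options using LangChain's ContextualCompressionRetriever; the FAISS store and example texts stand in for the MLMD vector DB, and the model names are assumptions.

```python
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import EmbeddingsFilter, LLMChainFilter
from langchain_community.chat_models import ChatOllama
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import FAISS

embeddings = OllamaEmbeddings(model="nomic-embed-text")
vectorstore = FAISS.from_texts(  # stand-in for the MLMD vector DB
    ["st.line_chart draws a line chart.", "RandomForestClassifier fits a random forest."],
    embeddings,
)

# Option A: drop retrieved chunks whose similarity to the query falls below a threshold.
embeddings_filter = EmbeddingsFilter(embeddings=embeddings, similarity_threshold=0.6)
# Option B: ask an LLM to judge, per chunk, whether it is relevant to the query.
llm_filter = LLMChainFilter.from_llm(ChatOllama(model="llama3"))

retriever = ContextualCompressionRetriever(
    base_compressor=embeddings_filter,  # or llm_filter
    base_retriever=vectorstore.as_retriever(),
)
docs = retriever.invoke("How do I train a random forest on the uploaded data?")
```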
## Todo list

- 2024-04-14: Optimize the file analysis and query routine: use a vector database and memorize the names of loaded files.
- 2024-04-17: Allow a URL input for web context analysis.
- 2024-08-21: Add GraphRAG.

## Some considerations

- 2024-04-17: Initial idea for how to write uploaded files into the vector database: first analyze the uploaded file with an LLM, extract keywords, and store the content by keyword? (A rough sketch of this idea follows.)
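A rough sketch of the keyword-indexing idea above, assuming an Ollama-served LLM; every name, prompt, and store choice here is illustrative, not existing code in this repo.

```python
from langchain_community.chat_models import ChatOllama
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import FAISS

llm = ChatOllama(model="llama3")

def extract_keywords(text: str, max_keywords: int = 10) -> list[str]:
    """Ask the LLM for the key terms of one chunk of an uploaded file."""
    prompt = (
        f"Extract at most {max_keywords} keywords from the following text. "
        "Return only a comma-separated list.\n\n" + text
    )
    reply = llm.invoke(prompt).content
    return [kw.strip() for kw in reply.split(",") if kw.strip()]

# Store each chunk with its keywords as metadata, so retrieval can be narrowed
# by keyword in addition to plain similarity search.
chunks = ["...first chunk of the uploaded file...", "...second chunk..."]
metadatas = [{"keywords": ", ".join(extract_keywords(c))} for c in chunks]
vectorstore = FAISS.from_texts(chunks, OllamaEmbeddings(model="nomic-embed-text"), metadatas=metadatas)

hits = vectorstore.similarity_search("question or keyword about the file", k=2)
```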