Commit 370d802

Merge pull request #104 from SubramanyamChalla24/backend_ml
Integrated Cohere with Qdrant to get similarity scores.
2 parents 076125f + 8113b8f commit 370d802

5 files changed: +305 -88 lines changed

README.md

+53-36
@@ -57,65 +57,65 @@ Follow these steps to set up the environment and run the application.
5757

5858
2. Clone the forked repository.
5959

60-
```bash
61-
git clone https://github.com/<YOUR-USERNAME>/Resume-Matcher.git
62-
cd Resume-Matcher
63-
```
60+
```bash
61+
git clone https://github.com/<YOUR-USERNAME>/Resume-Matcher.git
62+
cd Resume-Matcher
63+
```
6464

6565
3. Create a Python Virtual Environment:
6666

67-
- Using [virtualenv](https://learnpython.com/blog/how-to-use-virtualenv-python/):
67+
- Using [virtualenv](https://learnpython.com/blog/how-to-use-virtualenv-python/):
6868

69-
_Note_: Check how to install virtualenv on your system here [link](https://learnpython.com/blog/how-to-use-virtualenv-python/).
69+
_Note_: Check how to install virtualenv on your system here [link](https://learnpython.com/blog/how-to-use-virtualenv-python/).
7070

71-
```bash
72-
virtualenv env
73-
```
71+
```bash
72+
virtualenv env
73+
```
7474

75-
**OR**
75+
**OR**
7676

77-
- Create a Python Virtual Environment:
77+
- Create a Python Virtual Environment:
7878

79-
```bash
80-
python -m venv env
81-
```
79+
```bash
80+
python -m venv env
81+
```
8282

8383
4. Activate the Virtual Environment.
8484

85-
- On Windows.
85+
- On Windows.
8686

87-
```bash
88-
env\Scripts\activate
89-
```
87+
```bash
88+
env\Scripts\activate
89+
```
9090

91-
- On macOS and Linux.
91+
- On macOS and Linux.
9292

93-
```bash
94-
source env/bin/activate
95-
```
93+
```bash
94+
source env/bin/activate
95+
```
9696

9797
5. Install Dependencies:
9898

99-
```bash
100-
pip install -r requirements.txt
101-
```
99+
```bash
100+
pip install -r requirements.txt
101+
```
102102

103103
6. Prepare Data:
104104

105-
- Resumes: Place your resumes in PDF format in the `Data/Resumes` folder. Remove any existing contents in this folder.
106-
- Job Descriptions: Place your job descriptions in PDF format in the `Data/JobDescription` folder. Remove any existing contents in this folder.
105+
- Resumes: Place your resumes in PDF format in the `Data/Resumes` folder. Remove any existing contents in this folder.
106+
- Job Descriptions: Place your job descriptions in PDF format in the `Data/JobDescription` folder. Remove any existing contents in this folder.
107107

108108
7. Parse Resumes to JSON:
109109

110-
```python
111-
python run_first.py
112-
```
110+
```python
111+
python run_first.py
112+
```
113113

114114
8. Run the Application:
115115

116-
```python
117-
streamlit run streamlit_app.py
118-
```
116+
```python
117+
streamlit run streamlit_app.py
118+
```
119119

120120
**Note**: For local versions, you do not need to run "streamlit_second.py" as it is specifically for deploying to Streamlit servers.
121121

@@ -127,12 +127,29 @@ Follow these steps to set up the environment and run the application.
127127

128128
1. Build the image and start application
129129

130-
```bash
131-
docker-compose up
132-
```
130+
```bash
131+
docker-compose up
132+
```
133133

134134
2. Open `localhost:80` on your browser
135135

136+
### Cohere and Qdrant
137+
138+
1. Visit [Cohere website registration](https://dashboard.cohere.ai/welcome/register) and create an account.
139+
2. Go to API keys and copy your Cohere API key.
140+
3. Visit [Qdrant website](https://cloud.qdrant.io/) and create an account.
141+
4. Get your API key and cluster URL.
142+
5. Create a YAML file named `config.yml` in the `Scripts/Similarity/` folder.
143+
6. The config file should use the following format:
144+
```yaml
145+
cohere:
146+
  api_key: cohere_key
147+
qdrant:
148+
  api_key: qdrant_api_key
149+
  url: qdrant_cluster_url
150+
```
151+
7. Replace the placeholder values with your own, without quotes.
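
Review note: a config file with a missing or misspelled key only fails later, when `QdrantSearch` indexes `config['qdrant']['url']`. The sketch below checks the parsed dictionary up front. It assumes the dictionary has the shape `yaml.safe_load` would produce for the format shown above; `missing_config_keys` and `REQUIRED_KEYS` are hypothetical names, not part of this commit.

```python
# Hypothetical helper, not part of the commit: validates the dict that
# yaml.safe_load would return for Scripts/Similarity/config.yml.
REQUIRED_KEYS = {
    "cohere": ["api_key"],
    "qdrant": ["api_key", "url"],
}


def missing_config_keys(config):
    """Return dotted names of required keys that are absent or empty."""
    if not isinstance(config, dict):
        # A failed load (for example None) counts as everything missing.
        config = {}
    missing = []
    for section, keys in REQUIRED_KEYS.items():
        values = config.get(section)
        if not isinstance(values, dict):
            values = {}
        for key in keys:
            if not values.get(key):
                missing.append(f"{section}.{key}")
    return missing


example = {
    "cohere": {"api_key": "cohere_key"},
    "qdrant": {"api_key": "qdrant_api_key"},  # url left out on purpose
}
print(missing_config_keys(example))  # ['qdrant.url']
```

Reporting every missing key at once, rather than stopping at the first, lets a user fix the file in a single pass.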
152+
136153
<br/>
137154

138155
<div align="center">

archive/resume_matcher.ipynb

+40-36
@@ -17,32 +17,11 @@
1717
"cells": [
1818
{
1919
"cell_type": "code",
20-
"execution_count": null,
20+
"execution_count": 5,
2121
"metadata": {
22-
"colab": {
23-
"base_uri": "https://localhost:8080/"
24-
},
25-
"id": "aHoRFk4LpFSZ",
26-
"outputId": "0a950106-ea2a-498a-9dcc-e99458b1f139"
22+
"id": "aHoRFk4LpFSZ"
2723
},
28-
"outputs": [
29-
{
30-
"output_type": "stream",
31-
"name": "stdout",
32-
"text": [
33-
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m44.5/44.5 kB\u001b[0m \u001b[31m1.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
34-
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.7/2.7 MB\u001b[0m \u001b[31m30.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
35-
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m132.5/132.5 kB\u001b[0m \u001b[31m1.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
36-
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.6/2.6 MB\u001b[0m \u001b[31m11.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
37-
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m75.4/75.4 kB\u001b[0m \u001b[31m6.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
38-
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m304.5/304.5 kB\u001b[0m \u001b[31m12.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
39-
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m74.5/74.5 kB\u001b[0m \u001b[31m6.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
40-
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m57.5/57.5 kB\u001b[0m \u001b[31m5.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
41-
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m58.3/58.3 kB\u001b[0m \u001b[31m5.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
42-
"\u001b[?25h"
43-
]
44-
}
45-
],
24+
"outputs": [],
4625
"source": [
4726
"!pip install cohere --quiet\n",
4827
"!pip install qdrant-client --quiet"
@@ -55,14 +34,27 @@
5534
"from qdrant_client import QdrantClient, models\n",
5635
"from qdrant_client.http.models import Batch\n",
5736
"import cohere\n",
37+
"\n",
5838
"def read_config(filepath):\n",
59-
" with open(filepath) as f:\n",
60-
" config = yaml.safe_load(f)\n",
61-
" return config\n",
39+
" try:\n",
40+
" with open(filepath) as f:\n",
41+
" config = yaml.safe_load(f)\n",
42+
" return config\n",
43+
" except FileNotFoundError as e:\n",
44+
" print(f\"Configuration file {filepath} not found: {e}\")\n",
45+
" except yaml.YAMLError as e:\n",
46+
" print(f\"Error parsing YAML in configuration file {filepath}: {e}\")\n",
47+
" except Exception as e:\n",
48+
" print(f\"Error reading configuration file {filepath}: {e}\")\n",
49+
" return None\n",
50+
"\n",
6251
"\n",
6352
"class QdrantSearch:\n",
6453
" def __init__(self, resumes, jd):\n",
6554
" config = read_config(\"config.yml\")\n",
55+
"\n",
56+
"\n",
57+
"\n",
6658
" self.cohere_key = config['cohere']['api_key']\n",
6759
" self.qdrant_key = config['qdrant']['api_key']\n",
6860
" self.qdrant_url = config['qdrant']['url']\n",
@@ -128,31 +120,34 @@
128120
"metadata": {
129121
"id": "SXOgwcCATtww"
130122
},
131-
"execution_count": null,
123+
"execution_count": 6,
132124
"outputs": []
133125
},
134126
{
135127
"cell_type": "code",
136128
"source": [
137129
"resumes = [\"Professional Summary Highly skilled MERN Stack Developer with over 10 years of experience specializing in designing building and maintaining complex web applications Proficient in MongoDB Expressjs React and Nodejs Currently contributing to the development of AI technologies at OpenAI with a primary focus on the ChatGPT project Skills JavaScript and TypeScript MongoDB Expressjs React Nodejs MERN stack RESTful APIs Git and GitHub Docker and Kubernetes Agile and Scrum Python and Machine Learning basics Experience June 2020 PresentMERN Stack Developer OpenAI San Francisco USA Working on the development of the ChatGPT project using Nodejs Expressjs and React Implementing RESTful services for communication between frontend and backend Utilizing Docker and Kubernetes for deployment and management of applications Working in an Agile environment delivering highquality software every sprint Contributing to the design and implementation of machine learning algorithms for natural language processing tasks July 2015 May 2020Full Stack Developer Uber San Francisco USA Developed and maintained scalable web applications using MERN stack Ensured the performance quality and responsiveness of applications Successfully deployed solutions using Docker and Kubernetes Collaborated with a team of engineers product managers and UX designers Led a team of junior developers conducted code reviews and ensured adherence to best coding practices Worked closely with the data science team to optimize recommendation algorithms and enhance user experience June 2012 June 2015Software Developer Facebook Menlo Park USA Developed features for the Facebook web application using React Ensured the performance of the MongoDB databases Utilized RESTful APIs for communication between different parts of the application Worked in a fastpaced testdriven development environment Assisted in migrating the legacy system to a modern MERN stack architecture Education 2009 2012 PhD in Computer Science 
CalTech Pasadena USA 2007 2009 Master of Science in Computer Science MIT Cambridge USA 2003 2007 Bachelor of Science in Computer Science UC San Diego San Diego USA 1/2 Projects 2019 PresentPersonal Project Gotham Event Planner Created a fullfeatured web application to plan and organize events in Gotham city Used MERN stack for development and Docker for deployment The application allows users to create manage and share events and integrates with Google Maps API to display event locations 2/2\"]\n",
138130
"job_description = \"Job Description Java Developer 3 Years of Experience Tech Solutions San Francisco CA USA About Us At Tech Solutions we believe in the power of technology to solve complex problems We are a dynamic forwardthinking tech company specializing in custom software solutions for various industries We are seeking a talented and experienced Java Developer to join our team Job Description We are seeking a skilled Java Developer with at least 3 years of experience in building highperforming scal able enterprisegrade applications You will be part of a talented software team that works on missioncritical applications Your roles and responsibilities will include managing Java/Java EE application development while providing expertise in the full software development lifecycle Responsibilities •Designing implementing and maintaining Java applications that are often highvolume and low latency required for missioncritical systems •Delivering high availability and performance •Contributing to all phases of the development lifecycle •Writing welldesigned efficient and testable code •Conducting software analysis programming testing and debugging •Ensuring designs comply with specifications •Preparing and producing releases of software components •Supporting continuous improvement by investigating alternatives and technologies and presenting these for architectural review Requirements •BS/MS degree in Computer Science Engineering or a related subject •Proven handson Software Development experience •Proven working experience in Java development •Handson experience in designing and developing applications using Java EE platforms •ObjectOriented Analysis and design using common design patterns •Profound insight of Java and JEE internals Classloading Memory Management Transaction man agement etc 1 •Excellent knowledge of Relational Databases SQL and ORM technologies JPA2 Hibernate •Experience in developing web applications using at least one popular web framework JSF 
Wicket GWT Spring MVC •Experience with testdriven development Benefits •Competitive salary package •Health dental and vision insurance •Retirement savings plan •Professional development opportunities •Flexible work hours Tech Solutions is proud to be an equal opportunity employer We celebrate diversity and are committed to creating an inclusive environment for all employees How to Apply To apply please submit your resume and a brief explanation of your relevant experience to 2\"\n",
131+
"config = read_config(\"config.yml\")\n",
132+
"if not config:\n",
133+
" print(\"Cannot process this as there is no config.yml\")\n",
134+
"else:\n",
135+
" qdrant_search = QdrantSearch(resumes, job_description)\n",
139136
"\n",
140-
"qdrant_search = QdrantSearch(resumes, job_description)\n",
137+
" qdrant_search.update_qdrant()\n",
141138
"\n",
142-
"qdrant_search.update_qdrant()\n",
143-
"\n",
144-
"results = qdrant_search.search()\n",
145-
"for r in results:\n",
146-
" print(r)"
139+
" results = qdrant_search.search()\n",
140+
" for r in results:\n",
141+
" print(r)"
147142
],
148143
"metadata": {
149144
"colab": {
150145
"base_uri": "https://localhost:8080/"
151146
},
152147
"id": "rlP3s5euo435",
153-
"outputId": "3f4f15b6-d446-4491-d4d5-d9ba14a2a145"
148+
"outputId": "389c00e7-8cd1-4dd6-f517-d923e3c4bf2a"
154149
},
155-
"execution_count": null,
150+
"execution_count": 10,
156151
"outputs": [
157152
{
158153
"output_type": "stream",
@@ -162,6 +157,15 @@
162157
]
163158
}
164159
]
160+
},
161+
{
162+
"cell_type": "code",
163+
"source": [],
164+
"metadata": {
165+
"id": "WFdXngZkEyOm"
166+
},
167+
"execution_count": null,
168+
"outputs": []
165169
}
166170
]
167171
}
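
Review note on error reporting in `read_config`: `exc_info` is a keyword of the `logging` functions, and `print()` raises `TypeError` if it receives it. If tracebacks are wanted in the error paths, a sketch along these lines may help. `load_or_log`, `failing_loader`, the `loader` parameter, and the logger name are illustrative assumptions and not part of the commit.

```python
import logging

# Hypothetical logger name; the notebook in this commit uses print() throughout.
logger = logging.getLogger("resume_matcher.config")


def print_rejects_exc_info():
    """Return True if print() refuses the exc_info keyword."""
    try:
        print("parse error", exc_info=True)
    except TypeError:
        return True
    return False


def load_or_log(loader, filepath):
    """Call loader(filepath); on failure, log with traceback and return None."""
    try:
        return loader(filepath)
    except Exception as e:
        # exc_info=True attaches the active exception's traceback to the record.
        logger.error(
            "Error reading configuration file %s: %s", filepath, e, exc_info=True
        )
        return None


def failing_loader(filepath):
    # Stand-in for a loader that fails, such as a YAML parse error.
    raise ValueError("bad YAML")


print(print_rejects_exc_info())
print(load_or_log(failing_loader, "config.yml"))
```

Returning `None` on failure matches the behavior of `read_config` in the notebook, so the `if not config:` guard in the calling cell would keep working.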

requirements.txt

+2-1
@@ -108,4 +108,5 @@ wasabi==1.1.2
108108
watchdog==3.0.0
109109
zipp==3.16.2
110110

111-
cohere~=4.19.2
111+
cohere~=4.19.2
112+
qdrant-client
