How can we optimize our online shopping experience?
Most of us who shop online have probably bought something on Amazon. Over the years it has become one of the largest online marketplaces, so when we search for something, we can trust Amazon's search algorithms to return the best available options for our search terms. Let's talk about how we can build a simple real-time product recommender on top of Amazon's results and optimize our shopping experience.
The motivation behind the idea
The main idea is to see how robust and easy we can make our buying process. Let's say we want to buy a pair of headphones. When we search Amazon, we get a list of around 25-30 products. Every buyer has some preferences, and the most important of these are brand and price. In addition, buyers look at other factors such as product popularity, product rating, product reviews, and best price match. As users, we also sometimes feel it would be great if we could quickly see what specs each product offers and make a decision from an expert perspective.
Needs differ from one buyer to another. Some buyers focus on ratings, some on reviews, and some on price. Wouldn't it be great if we could let the user choose what to focus on? Let's see how we can achieve these goals.
The idea
The idea is to assign a score to each product for each of the fields that a user looks at while buying. For example, we give a product one score based on its popularity, another based on its reviews, and so on. We then calculate a weighted total score based on the user's preferences.
Let's look at the idea with a concrete example. Say we want to buy a pair of headphones, so we search and get a list of 25 items. We assign each product a score x1 based on ratings, x2 based on popularity, x3 based on reviews, and x4 based on price constraints. Now we ask the user whether they have a preference, that is, whether they want to focus more on one aspect than the others. For example, if the user wants to focus more on reviews, we calculate the total score as follows:
y = x1 + x2 + a*x3 + x4
Here a is a weight greater than 1, so x3 counts for more than the other factors. If the user has no such preference, we simply calculate
y = x1 + x2 + x3 + x4
Each score is scaled to a maximum of 1 so that all factors are weighted evenly by default. We can then sort the products by their total score to get our results.
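The weighting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `weighted_score`, the factor names, the example scores, and the weight `a = 5` are all hypothetical choices made for this example.

```python
def weighted_score(scores, focus=None, a=5):
    """Return y = x1 + x2 + x3 + x4, multiplying the focused score by a.

    scores: dict mapping a factor name to a value scaled to at most 1.
    focus:  name of the factor the user cares about most, or None.
    a:      illustrative weight for the focused factor (an assumption).
    """
    return sum(a * x if name == focus else x for name, x in scores.items())


# Hypothetical scores for one product, each already scaled to at most 1.
product = {"rating": 0.5, "popularity": 0.25, "reviews": 0.5, "price": 1.0}

balanced = weighted_score(product)                         # no preference
review_focused = weighted_score(product, focus="reviews")  # reviews weighted by a
```

With no focus, every factor contributes equally; with a focus, only that factor is scaled, which is what pushes products that do well on it toward the top of the sorted list.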
At the same time, we will create a specification list of all products to help our users with specific requirements to get a feel for which product is best suited for them.
Let's put it into practice.
Application
If we search the Amazon website, our options will appear as shown below.
We will scrape the options returned by Amazon's search algorithm. We will do this with Selenium WebDriver.
If we take a close look at the address bar after a search, we can easily work out how to build a direct search URL that takes us straight to the Amazon results page.
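As a small aside, the direct search URL can also be built with the standard library so that spaces and special characters are escaped safely. This is a sketch, not the article's code: `build_search_url` is a hypothetical helper, and the base URL and trailing parameter are the ones used in this article's search function.

```python
from urllib.parse import quote_plus

# Base URL and trailing parameter as used in this article's search function.
BASE = "https://www.amazon.in/s?k="
SUFFIX = "&ref=nb_sb_noss"


def build_search_url(phrase):
    """Return the direct search URL for a phrase, escaping spaces and symbols."""
    return BASE + quote_plus(phrase) + SUFFIX


url = build_search_url("wireless headphones")
```

`quote_plus` turns spaces into `+`, just like a manual replace, and also escapes characters such as `&` that would otherwise break the query string.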
import json  # used later to load and save the scraped data
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup
import requests

def search_am(phrase):
    # Build the direct search URL from the search phrase
    link="https://www.amazon.in/s?k="
    l_end="&ref=nb_sb_noss"
    phrase_w=phrase.replace(' ','+')
    link_full=link+phrase_w+l_end
    #print(link_full)
    driver = webdriver.Chrome()
    wait = WebDriverWait(driver, 5)
    driver.get(link_full)
    # Keep only the anchor tags that carry the product-link class
    names_f=[]
    names=driver.find_elements(By.TAG_NAME, "a")
    for name in names:
        className = name.get_attribute('class')
        if className=='a-link-normal a-text-normal':
            names_f.append(name)
    # Collect the target URL of every product link
    links=[]
    for i in names_f:
        temp= i.get_attribute('href')
        links.append(temp)
    driver.quit()
    return links
The above function collects the links of all the products listed on the search page and returns them.
Each link takes us to a specific product page.
Next, let's focus on some parts of the product page.
1. The rating section: This part shows the rating of the product.
2. The popularity section: Here I used the number of ratings as a measure of popularity.
3. The price section: This section shows the price of the product.
4. The specifications section: This section lists all the specifications and details of the product.
5. The reviews section: This section shows the customer reviews of the product.
If we look closely, each review has a headline in bold. This line gives a high-level summary of the review. We will take these headlines, judge their sentiment, and assign the review score.
First, let's extract the necessary details from the product pages.
def get_element_dets(link):
    driver = webdriver.Chrome()
    wait = WebDriverWait(driver, 2)
    driver.get(link)
    # Product title
    title_o= driver.find_elements(By.ID, "productTitle")
    title=title_o[0].text
    # Number of ratings, used as the popularity measure
    number_o= driver.find_elements(By.ID, "acrCustomerReviewText")
    try:
        popularity=(number_o[0].text)
    except Exception:
        popularity='0'
    # Average rating, e.g. "4.1 out of 5" -> "4.1"
    rate=driver.find_elements(By.CSS_SELECTOR, "#reviewsMedley > div > div.a-fixed-left-grid-col.a-col-left > div.a-section.a-spacing-none.a-spacing-top-mini.cr-widget-ACR > div.a-fixed-left-grid.AverageCustomerReviews.a-spacing-small > div > div.a-fixed-left-grid-col.aok-align-center.a-col-right > div > span > span")
    try:
        rate_o=(rate[0].text).split(' ')[0]
    except Exception:
        rate_o='0'
    feat_f=[]
    tag=[]
    value=[]
    #features=driver.find_elements_by_css_selector("#feature-bullets > ul > li > span")
    # for f in features:
    #     feat_f.append(f.text)
    price=0
    # Specification table: header cells hold the names, data cells the values
    try:
        tag_o=driver.find_elements(By.TAG_NAME, 'th')
        for name in tag_o:
            className = name.get_attribute('class')
            if className=='a-color-secondary a-size-base prodDetSectionEntry':
                tag.append(name.text)
        value_o=driver.find_elements(By.TAG_NAME, 'td')
        for name in value_o:
            className = name.get_attribute('class')
            if className=='a-size-base':
                value.append(name.text)
        i=0
        while i<len(value):
            t=str(tag[i])+':'+str(value[i])
            feat_f.append(t)
            i+=1
    except Exception:
        feat_f=[':']
    # Price
    try:
        price_o= driver.find_elements(By.ID, "priceblock_ourprice")
        for name in price_o:
            className = name.get_attribute('class')
            if className=='a-size-medium a-color-price priceBlockBuyingPriceString':
                price=(name.text)
                break
    except Exception:
        price=0
    #price=price_o.text
    # Review headlines
    feedbacks=driver.find_elements(By.TAG_NAME, "a")
    feedback_f=[]
    for feed in feedbacks:
        className = feed.get_attribute('class')
        if className=='a-size-base a-link-normal review-title a-color-base review-title-content a-text-bold':
            feedback_f.append(feed.text)
    driver.quit()
    return feedback_f,title,rate_o,popularity,feat_f,price
The above code snippet scrapes all the required details from a product page and returns the review headlines, product title, rating, popularity, specifications, and price.
def caller(phrase):
    links=search_am(phrase)
    data={}
    print(len(links))
    for link in links:
        # One nested dictionary per product, keyed by the product link
        data[link]={}
        feedback_f,title,rate,popularity,feat_f,price=get_element_dets(link)
        data[link]['feedback']=feedback_f
        data[link]['title']=title
        data[link]['rate']=rate
        data[link]['popularity']=popularity
        data[link]['feats']=feat_f
        if isinstance(price, int):
            # No price was found, so the default 0 is kept
            data[link]['price']=price
        else:
            # Drop the currency symbol and keep the amount
            data[link]['price']=price.split(' ')[1]
    #print(len(data))
    return data
The above snippet helps organize all the products and their corresponding features in a dictionary format.
Each product is keyed by its link, and all of its features are stored in a nested dictionary as key-value pairs.
Amazon pages rarely vary in their tags, but it is best to use try and except blocks to handle errors just in case.
Now that we have scraped all the necessary data, we can start assigning scores.
System based on popularity and rating
def Assign_Popularity_Rating():
    with open('products.json', 'r') as openfile:
        data = json.load(openfile)
    temp = 0
    for k in data.keys():
        # "1,234 ratings" -> 1234 (commas removed so int() can parse it)
        p=int(data[k]['popularity'].split(' ')[0].replace(',',''))
        r=float(data[k]['rate'])
        # Bin the number of ratings into a popularity score from 1 to 4
        if p<50:
            temp=1
        elif p<100:
            temp=2
        elif p<150:
            temp=3
        else:
            temp=4
        score=(temp)
        data[k]['Popularity_Score']=score
        data[k]['Rating_Score']=r
    with open("products_mod.json", "w") as outfile:
        json.dump(data, outfile)
The above code assigns each product a score based on its popularity and its rating. For popularity, I used a binning (class-based) approach. The ratings are already numeric values that we obtained during scraping.
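The binning step can also be written compactly with the standard library. The sketch below assumes the same thresholds as the function above (50, 100, 150); the helper name `popularity_score` is hypothetical.

```python
from bisect import bisect_right

# Upper bounds of the first three bins, matching the thresholds used above.
THRESHOLDS = [50, 100, 150]


def popularity_score(num_ratings):
    """Map a ratings count to a score from 1 (fewest) to 4 (most)."""
    return bisect_right(THRESHOLDS, num_ratings) + 1


scores = [popularity_score(n) for n in (10, 50, 120, 5000)]
```

Keeping the thresholds in one list makes it easy to add or move bins without touching the chain of conditions.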
Review sentiment based system
from textblob import TextBlob

def Assign_Sentiment_Rating():
    with open('products_mod.json', 'r') as openfile:
        data = json.load(openfile)
    sm=0
    for k in data.keys():
        temp=data[k]['feedback']
        z=0   # number of review headlines
        sm=0  # sum of their polarities
        for i in temp:
            #print(i)
            z+=1
            t=TextBlob(i).sentiment.polarity
            #print(t)
            sm+=t
        # Average polarity; 0 when the product has no reviews
        if (z==0):
            rating=0
        else:
            rating=sm/z
        data[k]['Review_Score']=rating
    with open("products_mod_2.json", "w") as outfile:
        json.dump(data, outfile)
To detect the sentiment, I used the sentiment polarity function from the TextBlob library. It assigns a value from -1 to +1 based on the sentiment detected in a review headline. A product has multiple reviews, so we get one value per review. We add up the values obtained from all the reviews and divide by the number of reviews, which keeps the overall score between -1 and +1. We repeat the process for each product to get its review score.
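The averaging step on its own looks like this. It is a minimal sketch: `average_polarity` is a hypothetical helper, and the sample values are made-up numbers standing in for the polarities that TextBlob would return.

```python
def average_polarity(polarities):
    """Average per-review polarity values; return 0 when there are no reviews."""
    if not polarities:
        return 0
    return sum(polarities) / len(polarities)


# Made-up polarities for three review headlines, each in the range [-1, 1].
sample = [0.5, -0.25, 0.5]
score = average_polarity(sample)
```

Because every input lies between -1 and +1, the mean does too, so the review score stays on the same scale regardless of how many reviews a product has.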
Price relevance system
def check_price_relevence():
    with open('products_mod_2.json', 'r') as openfile:
        data = json.load(openfile)
    print("Enter the approximate price to tune the search")
    price=int(input())
    print("Enter a margin")
    margin=int(input())
    for k in data.keys():
        # Remove thousands separators before converting the price
        data_ref=str(data[k]['price']).replace(',','')
        temp=float(data_ref)
        # Score 1 if the price falls inside (price-margin, price+margin)
        if temp<price+margin and temp>price-margin:
            rating=1
        else:
            rating=0
        data[k]['Price_relevence_Score']=rating
    with open("products_mod_3.json", "w") as outfile:
        json.dump(data, outfile)
This is our price relevance function. It asks for an approximate price and a margin, then checks whether each product's price falls within that range to assign the relevance score.
After assigning all the scores, our dictionary holds the following for each product.
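The same check can be isolated as a small pure function, which makes it easy to try out without typing input. This is a sketch under assumptions: `price_relevance` is a hypothetical name and the prices are invented examples.

```python
def price_relevance(price_text, target, margin):
    """Return 1 if the price lies strictly within target +/- margin, else 0."""
    # Remove thousands separators such as the comma in "1,299.00".
    price = float(str(price_text).replace(",", ""))
    return 1 if target - margin < price < target + margin else 0


inside = price_relevance("1,299.00", target=1500, margin=300)
outside = price_relevance("2,499.00", target=1500, margin=300)
```

A product with no scraped price, stored as 0, falls outside any realistic range and therefore scores 0.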
Collecting specifications
We then create a CSV or Excel file with the specifications of all listed products for our customers with specific requirements.
import pandas as pd

def form_featureset():
    with open('products_mod_3.json', 'r') as openfile:
        data = json.load(openfile)
    feat=[]
    set_c=[]
    # First pass: collect every specification name seen in any product
    for k in data.keys():
        temp=data[k]['feats']
        temp2=[]
        for i in temp:
            tag=i.split(':')[0]
            if tag not in feat:
                feat.append(tag)
    #print(feat)
    # Second pass: build one row per product, with -1 for missing specs
    for k in data.keys():
        temp=data[k]['feats']
        temp2=[-1]*len(feat)
        for i in temp:
            tag=i.split(':')[0]
            #print(tag)
            ind= feat.index(tag)
            #print(ind)
            temp2[ind]= i.split(':')[1]
        set_c.append(temp2)
    df=pd.DataFrame(set_c,columns=feat)
    df.to_csv('product_descriptions.csv',index=False)
    return df
This snippet generates a data frame with all products and their specs listed to provide a view of the available specs.
The generated specification tables are as follows. I put -1 for values that were not specified on the product pages.
These tables are intended to assist customers in their comparative search for specifications.
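To make the table-building step easier to follow, here is a dependency-free sketch of the same idea on two made-up products. The helper name `build_spec_table` and the sample specifications are hypothetical. One deliberate difference from the function above is noted in the comments: it splits each entry on the first colon only.

```python
def build_spec_table(products):
    """Turn 'tag:value' strings into a header list and one row per product.

    Missing specifications are filled with -1.
    """
    columns = []
    parsed = []
    for feats in products:
        specs = {}
        for item in feats:
            # partition splits on the first colon only, so values that
            # themselves contain a colon are kept whole.
            tag, _, value = item.partition(":")
            specs[tag] = value
            if tag not in columns:
                columns.append(tag)
        parsed.append(specs)
    rows = [[specs.get(tag, -1) for tag in columns] for specs in parsed]
    return columns, rows


# Two invented products with partly different specifications.
products = [
    ["Brand:Acme", "Colour:Black"],
    ["Brand:Other", "Battery Life:20 Hours"],
]
columns, rows = build_spec_table(products)
```

The resulting header and rows can be passed directly to `pd.DataFrame(rows, columns=columns)` if a data frame is wanted.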
Weighted score
def tune_search(choice):
    with open('products_mod_3.json', 'r') as openfile:
        data = json.load(openfile)
    for k in data.keys():
        price_rel=data[k]['Price_relevence_Score']
        review_score=data[k]['Review_Score']
        # Scale popularity (1-4) and rating (0-5) to the range 0-1
        pop_score=data[k]['Popularity_Score']
        pop_score_k=pop_score/4
        rate_score=data[k]['Rating_Score']
        rate_score_k=rate_score/5
        # The factor chosen by the user gets a weight of 5
        if choice==1:
            total_score=5*pop_score_k+rate_score_k+review_score+price_rel
        elif choice==2:
            total_score=pop_score_k+5*rate_score_k+review_score+price_rel
        elif choice==3:
            total_score=pop_score_k+rate_score_k+review_score+5*price_rel
        elif choice==4:
            total_score=pop_score_k+rate_score_k+5*review_score+price_rel
        else:
            total_score=pop_score_k+rate_score_k+review_score+price_rel
        data[k]['Total_score']=total_score
        #print(data[k]['Total_score'])
    links=sort_d(data)
    return links
This code snippet computes a total score for each product based on the user's choice and returns the product links sorted by that score. I used a very basic conditional approach. I divided the ratings by 5 and the popularity scores by 4 to keep both between 0 and 1. The weight is set to 5 here, which is just an arbitrary choice.
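The `sort_d` helper is called in `tune_search` but its body is not shown in this article. The sketch below is only a guess at what such a helper might look like, assuming it returns the product links ordered from the highest to the lowest `Total_score`; the sample data is invented.

```python
def sort_d(data):
    """Return product links ordered from highest to lowest Total_score."""
    return sorted(data, key=lambda link: data[link]["Total_score"], reverse=True)


# Invented data: three product links with already computed total scores.
data = {
    "link_a": {"Total_score": 2.1},
    "link_b": {"Total_score": 4.6},
    "link_c": {"Total_score": 3.0},
}
ranked = sort_d(data)
```

Taking the first element of the returned list gives the top recommendation, and the rest can be offered as alternatives.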
This is our code here.
Application
The provided video demonstrates the application.
Whatever the choice, I have also provided the links to other options just to give the user more convenience and the ability to try other options.
In my case, you can see the Chrome window. Although it is automated and closes by itself, it still appears. You can prevent this by starting Chrome in headless mode, using ChromeOptions() when creating the Chrome driver.
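A headless setup might look like the sketch below. It assumes Selenium and a matching ChromeDriver are installed; `make_headless_driver` is a hypothetical helper name, and the driver it returns could replace the `webdriver.Chrome()` calls in the scraping functions.

```python
def make_headless_driver():
    """Start Chrome without a visible window (needs Selenium and ChromeDriver)."""
    # Imported inside the function so this sketch can be loaded
    # even on a machine where Selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless")  # run Chrome with no browser window
    return webdriver.Chrome(options=options)
```

Remember to call `quit()` on the returned driver when scraping is finished, since a headless browser gives no visual hint that it is still running.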
Future scope
The application can be modified or made more robust in two ways, but both require data that is not currently available (that I know of).
- If up-to-date sentiment datasets for Amazon reviews become available, we can build our own sentiment classifier, with additional classes beyond positive and negative sentiment. This would make our review-based scoring stronger.
- If specification data is available per product category, we can build our own embedding for the specifications. For example, if we have enough laptop data, we can create an embedding or encoding space for laptops and represent any laptop as a vector of its specifications. We can then build a new specification vector from the user's requirements, apply the K-Nearest Neighbors algorithm to find the K vectors closest to the requirements vector, and order them by their Euclidean distances. This gives us K laptops with specifications close to the user's needs, and lets us add a specification-based relevance score to make our system more robust.
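The nearest-neighbor idea in the second point can be illustrated with a toy example. Everything in it is an assumption made for illustration: the helper `nearest_products`, the hand-made three-number vectors (which stand in for a learned embedding), and the product names.

```python
import math


def nearest_products(catalog, wanted, k=2):
    """Return the k product names whose spec vectors are closest to `wanted`."""
    # Rank every product by Euclidean distance to the requirements vector.
    ranked = sorted(catalog, key=lambda name: math.dist(catalog[name], wanted))
    return ranked[:k]


# Hypothetical laptops encoded as (RAM in GB, storage in 100 GB, screen in inches).
catalog = {
    "laptop_a": (8, 2.56, 14.0),
    "laptop_b": (16, 5.12, 15.6),
    "laptop_c": (32, 10.24, 17.3),
}
wanted = (16, 5.0, 15.0)  # the user's requirements as a vector
closest = nearest_products(catalog, wanted, k=2)
```

In a real system the features would need to be scaled to comparable ranges first, otherwise the dimension with the largest numbers dominates the distance.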
Conclusion
We have seen how we can build a real-time product recommendation engine in Python in just a few steps.
Here is the GitHub link.
I hope that helps.