Encoding sentence for machine learning doesn't finish : Forums : PythonAnywhere

Encoding sentence for machine learning doesn't finish

Hi all,

I'm loading a BERT model to use for a semantic search service. Loading the model is fast and works fine, but the script hangs when it tries to encode the input sentence.

Here is the part I'm talking about:

def getRecommends(text1):
    print("calculate_recommends")
    rel_questions = []
    rel_links = []
    query = text1
    print("encode_sent")
    query_vec = model.encode([query], show_progress_bar=True)[0]  # <--- STOPS WORKING HERE
    print("encoded_sent")
    # compute normalized dot product as score
    score = np.sum(query_vec * doc_vecs, axis=1) / np.linalg.norm(doc_vecs, axis=1)
    topk_idx = np.argsort(score)[::-1][:4]
    for idx in topk_idx:
        rel_questions.append(questions[idx])
        rel_links.append(answers[idx])
    #for a in len(rel_questions):
    print("calculated recommends")
    return rel_questions, rel_links

I'm using the sentence_transformers module and a base model in this example.
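
For reference, the scoring step can be tried without the model at all. Below is a minimal sketch in plain Python (no NumPy) on made-up toy vectors; `rank_documents` and the toy data are hypothetical names used only for illustration. As in the function above, the dot product is divided by the document norm only. The query norm is the same for every document, so leaving it out does not change the ranking.

```python
import math


def rank_documents(query_vec, doc_vecs, k=4):
    """Return (indices of the k best documents, all scores), best first."""
    scores = []
    for doc in doc_vecs:
        dot = sum(q * d for q, d in zip(query_vec, doc))
        doc_norm = math.sqrt(sum(d * d for d in doc))
        # Same score as in getRecommends: dot product over document norm.
        scores.append(dot / doc_norm)
    # Sort document indices by descending score and keep the first k.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k], scores


# Toy data (hypothetical): three 2-dimensional "document" vectors.
toy_docs = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
top_idx, toy_scores = rank_documents([2.0, 1.0], toy_docs, k=2)
print(top_idx)  # indices of the two best-scoring documents
```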

Thank you :)

ozi | 4 posts | July 28, 2020, 7:27 p.m. | permalink

How long is it supposed to run for, and is there an error message?

conrad | 21 posts | PythonAnywhere staff | July 29, 2020, 10:06 a.m. | permalink

Nothing special. It just takes extremely long to finish that step, and after 5 minutes I get the harakiri message. Is the computation too heavy? I wouldn't expect encoding a sentence with a pretrained model to be a problem. Locally this step takes a few seconds at most.

ozi | 4 posts | July 29, 2020, 10:37 a.m. | permalink

What happens if you try to run it from a console? Does it take a really long time there too?

giles | 221 posts | PythonAnywhere staff | July 30, 2020, 9:43 a.m. | permalink

I'm sorry, I'm new to web design with Flask. In the console I can only run the code up to the point where the home page is set up and my encodings are loaded. When I run a query with user input on the running website, it gets stuck at the point of encoding the query. It takes too much time and the server goes into harakiri mode. When I encode the query with a TF-IDF model, it works fine. Is it possible that the encoding procedure is too complex? It is a 400 MB transformer model.

ozi | 4 posts | Aug. 2, 2020, 2:03 p.m. | permalink

TensorFlow requires threads by default, and we do not allow threads in web app code (see https://help.pythonanywhere.com/pages/MachineLearningInWebsiteCode/ ). Have a look at https://help.pythonanywhere.com/pages/AsyncInWebApps/ for some ways to move processing out of web apps.
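
As a rough illustration of moving processing out of the web app, here is a minimal sketch of a job queue kept in SQLite. It is only one possible shape, not the method described on the help page: the `jobs` table and the functions `init_db`, `submit_job`, `process_pending` and `get_result` are hypothetical names, and `encode_stub` is a stand-in for the real `model.encode` call. The idea is that the web view only enqueues the query and returns, while a separately running worker script does the encoding and stores the result for the site to pick up later.

```python
import json
import sqlite3


def encode_stub(text):
    # Hypothetical stand-in for model.encode([text])[0]; returns word lengths.
    return [float(len(word)) for word in text.split()]


def init_db(conn):
    """Create the job table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "query TEXT NOT NULL, "
        "status TEXT NOT NULL DEFAULT 'pending', "
        "result TEXT)"
    )
    conn.commit()


def submit_job(conn, query):
    """Web app side: store the query and return its job id immediately."""
    cur = conn.execute("INSERT INTO jobs (query) VALUES (?)", (query,))
    conn.commit()
    return cur.lastrowid


def process_pending(conn, encode=encode_stub):
    """Worker side: encode every pending query; return how many were done."""
    rows = conn.execute(
        "SELECT id, query FROM jobs WHERE status = 'pending' ORDER BY id"
    ).fetchall()
    for job_id, query in rows:
        vector = encode(query)
        conn.execute(
            "UPDATE jobs SET status = 'done', result = ? WHERE id = ?",
            (json.dumps(vector), job_id),
        )
    conn.commit()
    return len(rows)


def get_result(conn, job_id):
    """Web app side: return the stored vector, or None if not finished."""
    row = conn.execute(
        "SELECT status, result FROM jobs WHERE id = ?", (job_id,)
    ).fetchone()
    if row is None or row[0] != "done":
        return None
    return json.loads(row[1])


if __name__ == "__main__":
    # In-memory database for the demo; a real setup would use a shared file.
    demo = sqlite3.connect(":memory:")
    init_db(demo)
    job = submit_job(demo, "how do I reset my password")
    process_pending(demo)  # would normally run in the separate worker
    print(get_result(demo, job))
```

In a real setup the web app and the worker would open the same database file, the worker would call `process_pending` in a loop with a short sleep, and the page would poll `get_result` until the vector is available.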

glenn | 279 posts | PythonAnywhere staff | Aug. 3, 2020, 9:57 a.m. | permalink

Thank you for the answer and explanation! Are there any plans to add TensorFlow support? If not, can you recommend other hosting alternatives where this would not be a problem (sorry to be blunt)?

ozi | 4 posts | Aug. 4, 2020, 10:17 a.m. | permalink

No, we don't have any plans in that direction at the moment.

glenn | 279 posts | PythonAnywhere staff | Aug. 4, 2020, 4:50 p.m. | permalink