
news 2026/5/5 12:23:05

Contents

Download the IMDb data
Read the IMDb data
Build the tokenizer
Convert the reviews to lists of integers
Make the integer lists the same length
Add the embedding layer
Build the multilayer perceptron model
Add the flatten layer
Add the hidden layer
Add the output layer
View the model summary
Train the model
Evaluate model accuracy
Make predictions
View prediction results on the test data
The complete function
IMDb sentiment analysis with an RNN model
IMDb sentiment analysis with an LSTM model

GitHub: https://github.com/fz861062923/Keras

Download the IMDb data

    # Download site: http://ai.stanford.edu/~amaas/data/sentiment/

Read the IMDb data

    from keras.preprocessing import sequence
    from keras.preprocessing.text import Tokenizer

    C:\Users\admin\AppData\Local\conda\conda\envs\tensorflow\lib\site-packages\h5py\__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
      from ._conv import register_converters as _register_converters
    Using TensorFlow backend.

    # The reviews were scraped from the web, so HTML tags must be stripped with a regular expression
    import re
    def remove_html(text):
        r = re.compile(r'<[^>]+>')
        return r.sub('', text)

    # Read the files with a function that follows the IMDb directory layout
    import os
    def read_file(filetype):
        path = './aclImdb/'
        file_list = []
        positive = path + filetype + '/pos/'
        for f in os.listdir(positive):
            file_list += [positive + f]
        negative = path + filetype + '/neg/'
        for f in os.listdir(negative):
            file_list += [negative + f]
        print('filetype:', filetype, 'file_length:', len(file_list))
        # train and test each contain 12500 positive and 12500 negative reviews
        label = ([1] * 12500 + [0] * 12500)
        text = []
        for f_ in file_list:
            with open(f_, encoding='utf8') as f:
                text += [remove_html(''.join(f.readlines()))]
        return label, text

    # x holds the labels, y holds the review text
    x_train, y_train = read_file('train')

    filetype: train file_length: 25000

    x_test, y_test = read_file('test')

    filetype: test file_length: 25000

    y_train[0]

    Bromwell High is a cartoon comedy. It ran at the same time as some other programs about school life, such as Teachers. My 35 years in the teaching profession lead me to believe that Bromwell High's satire is much closer to reality than is Teachers.
    The scramble to survive financially, the insightful students who can see right through their pathetic teachers' pomp, the pettiness of the whole situation, all remind me of the schools I knew and their students. When I saw the episode in which a student repeatedly tried to burn down the school, I immediately recalled ......... at .......... High. A classic line: INSPECTOR: I'm here to sack one of your teachers. STUDENT: Welcome to Bromwell High. I expect that many adults of my age think that Bromwell High is far fetched. What a pity that it isn't!

Build the tokenizer

For detailed usage, see the official documentation: https://keras.io/preprocessing/text/

    token = Tokenizer(num_words=2000)  # build a dictionary of 2000 words
    # Read all training reviews and rank the words by how often they occur;
    # the 2000 most frequent words go into the dictionary
    token.fit_on_texts(y_train)

    # Check how many documents the tokenizer has read
    token.document_count

    25000

Convert the reviews to lists of integers

    train_seq = token.texts_to_sequences(y_train)
    test_seq = token.texts_to_sequences(y_test)

    print(y_train[0])

    Bromwell High is a cartoon comedy. It ran at the same time as some other programs about school life, such as Teachers. My 35 years in the teaching profession lead me to believe that Bromwell High's satire is much closer to reality than is Teachers. The scramble to survive financially, the insightful students who can see right through their pathetic teachers' pomp, the pettiness of the whole situation, all remind me of the schools I knew and their students. When I saw the episode in which a student repeatedly tried to burn down the school, I immediately recalled ......... at .......... High. A classic line: INSPECTOR: I'm here to sack one of your teachers. STUDENT: Welcome to Bromwell High. I expect that many adults of my age think that Bromwell High is far fetched.
    What a pity that it isn't!

    print(train_seq[0])

    [308, 6, 3, 1068, 208, 8, 29, 1, 168, 54, 13, 45, 81, 40, 391, 109, 137, 13, 57, 149, 7, 1, 481, 68, 5, 260, 11, 6, 72, 5, 631, 70, 6, 1, 5, 1, 1530, 33, 66, 63, 204, 139, 64, 1229, 1, 4, 1, 222, 899, 28, 68, 4, 1, 9, 693, 2, 64, 1530, 50, 9, 215, 1, 386, 7, 59, 3, 1470, 798, 5, 176, 1, 391, 9, 1235, 29, 308, 3, 352, 343, 142, 129, 5, 27, 4, 125, 1470, 5, 308, 9, 532, 11, 107, 1466, 4, 57, 554, 100, 11, 308, 6, 226, 47, 3, 11, 8, 214]

Make the integer lists the same length

    # Truncate long lists and pad short ones so that every list has length 100
    _train = sequence.pad_sequences(train_seq, maxlen=100)
    _test = sequence.pad_sequences(test_seq, maxlen=100)

    print(train_seq[0])

    [308, 6, 3, 1068, 208, 8, 29, 1, 168, 54, 13, 45, 81, 40, 391, 109, 137, 13, 57, 149, 7, 1, 481, 68, 5, 260, 11, 6, 72, 5, 631, 70, 6, 1, 5, 1, 1530, 33, 66, 63, 204, 139, 64, 1229, 1, 4, 1, 222, 899, 28, 68, 4, 1, 9, 693, 2, 64, 1530, 50, 9, 215, 1, 386, 7, 59, 3, 1470, 798, 5, 176, 1, 391, 9, 1235, 29, 308, 3, 352, 343, 142, 129, 5, 27, 4, 125, 1470, 5, 308, 9, 532, 11, 107, 1466, 4, 57, 554, 100, 11, 308, 6, 226, 47, 3, 11, 8, 214]

    print(_train[0])

    [  29    1  168   54   13   45   81   40  391  109  137   13   57  149
        7    1  481   68    5  260   11    6   72    5  631   70    6    1
        5    1 1530   33   66   63  204  139   64 1229    1    4    1  222
      899   28   68    4    1    9  693    2   64 1530   50    9  215    1
      386    7   59    3 1470  798    5  176    1  391    9 1235   29  308
        3  352  343  142  129    5   27    4  125 1470    5  308    9  532
       11  107 1466    4   57  554  100   11  308    6  226   47    3   11
        8  214]

The original list has 106 entries, so the first 6 are dropped: by default pad_sequences truncates (and pads) at the front.

    _train.shape

    (25000, 100)

Add the embedding layer

The embedding layer converts each list of integers into a list of vectors (why this conversion is needed is worth thinking about).

    from keras.models import Sequential
    from keras.layers.core import Dense, Dropout, Activation, Flatten
    from keras.layers.embeddings import Embedding

    model = Sequential()

    model.add(Embedding(output_dim=32,      # map each integer to a 32-dimensional vector
                        input_dim=2000,     # 2000, because the dictionary built earlier has 2000 words
                        input_length=100))  # each integer list has length 100
    model.add(Dropout(0.25))

Build the multilayer perceptron model

Add the flatten layer

    model.add(Flatten())

Add the hidden layer

    model.add(Dense(units=256, activation='relu'))
    model.add(Dropout(0.35))

Add the output layer

    model.add(Dense(units=1,  # a single output neuron: 1 means a positive review, 0 a negative one
                    activation='sigmoid'))

View the model summary
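The parameter counts that `model.summary()` reports for this network can be checked by hand. What follows is a minimal sketch in plain Python, assuming the layer sizes used in this walkthrough (a 2000-word dictionary, 32-dimensional embeddings, sequence length 100, 256 hidden units). The helper names `embedding_params` and `dense_params` are made up for illustration and are not Keras functions.

```python
def embedding_params(input_dim, output_dim):
    # One output_dim-sized vector is stored per dictionary entry.
    return input_dim * output_dim


def dense_params(n_inputs, units):
    # One weight per (input, unit) pair plus one bias per unit.
    return n_inputs * units + units


vocab_size, embed_dim, seq_len, hidden = 2000, 32, 100, 256

counts = {
    "embedding": embedding_params(vocab_size, embed_dim),
    # Flatten turns the (100, 32) embedding output into 100 * 32 inputs.
    "dense_hidden": dense_params(seq_len * embed_dim, hidden),
    "dense_output": dense_params(hidden, 1),
}
total = sum(counts.values())
print(counts, total)
```

Dropout and Flatten contribute no trainable parameters, so these three layers account for the whole total. Almost all of the weights sit in the first dense layer, which is one reason this model fits the training set so quickly.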
    model.summary()

    _________________________________________________________________
    Layer (type)                 Output Shape              Param #
    =================================================================
    embedding_1 (Embedding)      (None, 100, 32)           64000
    _________________________________________________________________
    dropout_1 (Dropout)          (None, 100, 32)           0
    _________________________________________________________________
    flatten_1 (Flatten)          (None, 3200)              0
    _________________________________________________________________
    dense_1 (Dense)              (None, 256)               819456
    _________________________________________________________________
    dropout_2 (Dropout)          (None, 256)               0
    _________________________________________________________________
    dense_2 (Dense)              (None, 1)                 257
    =================================================================
    Total params: 883,713
    Trainable params: 883,713
    Non-trainable params: 0
    _________________________________________________________________

Train the model

    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

    train_history = model.fit(_train, x_train, batch_size=100, epochs=10,
                              verbose=2, validation_split=0.2)

    Train on 20000 samples, validate on 5000 samples
    Epoch 1/10
     - 21s - loss: 0.4851 - acc: 0.7521 - val_loss: 0.4491 - val_acc: 0.7894
    Epoch 2/10
     - 20s - loss: 0.2817 - acc: 0.8829 - val_loss: 0.6735 - val_acc: 0.6892
    Epoch 3/10
     - 13s - loss: 0.1901 - acc: 0.9285 - val_loss: 0.5907 - val_acc: 0.7632
    Epoch 4/10
     - 12s - loss: 0.1066 - acc: 0.9622 - val_loss: 0.7522 - val_acc: 0.7528
    Epoch 5/10
     - 13s - loss: 0.0681 - acc: 0.9765 - val_loss: 0.9863 - val_acc: 0.7404
    Epoch 6/10
     - 13s - loss: 0.0486 - acc: 0.9827 - val_loss: 1.0818 - val_acc: 0.7506
    Epoch 7/10
     - 14s - loss: 0.0380 - acc: 0.9859 - val_loss: 0.9823 - val_acc: 0.7780
    Epoch 8/10
     - 17s - loss: 0.0360 - acc: 0.9860 - val_loss: 1.1297 - val_acc: 0.7634
    Epoch 9/10
     - 13s - loss: 0.0321 - acc: 0.9891 - val_loss: 1.2459 - val_acc: 0.7480
    Epoch 10/10
     - 14s - loss: 0.0281 - acc: 0.9899 - val_loss: 1.4111 - val_acc: 0.7304

Evaluate model accuracy

    scores = model.evaluate(_test, x_test)  # first argument is the features, second is the labels

    25000/25000 [==============================] - 4s 148us/step

    scores[1]

    0.80972

Make predictions
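With a single sigmoid output unit, the model produces one probability between 0 and 1 per review, and `predict_classes` turns it into a label by thresholding at 0.5. Later Keras releases removed `predict_classes`, so it helps to see the step spelled out. This is a minimal sketch in plain Python with made-up probabilities standing in for the output of `model.predict(_test)`; `to_classes` is a hypothetical helper, not a Keras function.

```python
def to_classes(probabilities, threshold=0.5):
    # Each row holds the single sigmoid output for one review.
    return [[1 if p > threshold else 0 for p in row] for row in probabilities]


# Made-up probabilities in the (n_samples, 1) layout that predict returns.
fake_probs = [[0.93], [0.12], [0.50], [0.71]]

classes = to_classes(fake_probs)
# Flatten to one label per review, the same idea as reshape(-1).
flat = [label for row in classes for label in row]

print(classes)
print(flat)
```

A probability of exactly 0.5 is labelled 0 here because the comparison is strict.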
    predict = model.predict_classes(_test)

    predict[:10]

    array([[1],
           [0],
           [1],
           [1],
           [1],
           [1],
           [1],
           [1],
           [1],
           [1]])

    # Convert to a one-dimensional array
    predict = predict.reshape(-1)
    predict[:10]

    array([1, 0, 1, 1, 1, 1, 1, 1, 1, 1])

View prediction results on the test data

    _dict = {1: 'positive review', 0: 'negative review'}
    def display(i):
        print(y_test[i])
        print('true label:', _dict[x_test[i]], 'predicted:', _dict[predict[i]])

    display(0)

    I went and saw this movie last night after being coaxed to by a few friends of mine. I'll admit that I was reluctant to see it because from what I knew of Ashton Kutcher he was only able to do comedy. I was wrong. Kutcher played the character of Jake Fischer very well, and Kevin Costner played Ben Randall with such professionalism. The sign of a good movie is that it can toy with our emotions. This one did exactly that. The entire theater (which was sold out) was overcome by laughter during the first half of the movie, and were moved to tears during the second half. While exiting the theater I not only saw many women in tears, but many full grown men as well, trying desperately not to let anyone see them crying. This movie was great, and I suggest that you go see it before you judge.
    true label: positive review predicted: positive review

The complete function

    def review(input_text):
        input_seq = token.texts_to_sequences([input_text])
        pad_input_seq = sequence.pad_sequences(input_seq, maxlen=100)
        predict_result = model.predict_classes(pad_input_seq)
        print(_dict[predict_result[0][0]])

    # Predict on a review taken from IMDb
    review("""
    Going into this movie, I had low expectations. I'd seen poor reviews, and I also kind of hate the idea of remaking animated films for no reason other than to make them live action, as if that's supposed to make them better some how. This movie pleasantly surprised me!

    Beauty and the Beast is a fun, charming movie, that is a blast in many ways. The film very easy on the eyes! Every shot is colourful and beautifully crafted. The acting is also excellent. Dan Stevens is excellent. You can see him if you look closely at The Beast, but not so clearly that it pulls you out of the film. His performance is suitably over the top in anger, but also very charming. Emma Watson was fine, but to be honest, she was basically just playing Hermione, and I didn't get much of a character from her. She likes books, and she's feisty. That's basically all I got. For me, the one saving grace for her character, is you can see how much fun Emma Watson is having. I've heard interviews in which she's expressed how much she's always loved Belle as a character, and it shows.

    The stand out for me was Lumieré, voiced by Ewan McGregor. He was hilarious, and over the top, and always fun! He lit up the screen (no pun intended) every time he showed up!

    The only real gripes I have with the film are some questionable CGI with the Wolves and with a couple of The Beast's scenes, and some pacing issues. The film flows really well, to such an extent that in some scenes, the camera will dolly away from the character it's focusing on, and will pan across the countryside, and track to another, far away, with out cutting. This works really well, but a couple times, the film will just fade to black, and it's quite jarring. It happens like 3 or 4 times, but it's really noticeable, and took me out of the experience. Also, they added some stuff to the story that I don't want to spoil, but I don't think it worked on any level, story wise, or logically.

    Overall, it's a fun movie! I would recommend it to any fan of the original, but those who didn't like the animated classic, or who hate musicals might be better off staying aw""")

    positive review

    review("""
    This is a horrible Disney piece of crap full of completely lame singsongs, a script so wrong it is an insult to other scripts to even call it a script. The only way I could enjoy this is after eating two complete space cakes, and even then I would prefer analysing our wallpaper!""")

    negative review

That completes sentiment prediction with a multilayer perceptron in Keras. Looking back, the experiment can be improved in more ways than changing the number of neurons: the dictionary can be made larger than the 2000 words used here, and the sequence length maxlen can be made longer than 100.

IMDb sentiment analysis with an RNN model

The benefits of using an RNN are also worth thinking about.

    from keras.models import Sequential
    from keras.layers.core import Dense, Dropout, Activation
    from keras.layers.embeddings import Embedding
    from keras.layers.recurrent import SimpleRNN

    model_rnn = Sequential()

    model_rnn.add(Embedding(output_dim=32, input_dim=2000, input_length=100))
    model_rnn.add(Dropout(0.25))

    model_rnn.add(SimpleRNN(units=16))  # the RNN layer has 16 neurons

    model_rnn.add(Dense(units=256, activation='relu'))
    model_rnn.add(Dropout(0.25))
    model_rnn.add(Dense(units=1, activation='sigmoid'))

    model_rnn.summary()

    _________________________________________________________________
    Layer (type)                 Output Shape              Param #
    =================================================================
    embedding_1 (Embedding)      (None, 100, 32)           64000
    _________________________________________________________________
    dropout_1 (Dropout)          (None, 100, 32)           0
    _________________________________________________________________
    simple_rnn_1 (SimpleRNN)     (None, 16)                784
    _________________________________________________________________
    dense_1 (Dense)              (None, 256)               4352
    _________________________________________________________________
    dropout_2 (Dropout)          (None, 256)               0
    _________________________________________________________________
    dense_2 (Dense)              (None, 1)                 257
    =================================================================
    Total params: 69,393
    Trainable params: 69,393
    Non-trainable params: 0
    _________________________________________________________________

    model_rnn.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    train_history = model_rnn.fit(_train, x_train, batch_size=100, epochs=10,
                                  verbose=2, validation_split=0.2)

    Train on 20000 samples, validate on 5000 samples
    Epoch 1/10
     - 13s - loss: 0.5200 - acc: 0.7319 - val_loss: 0.6095 - val_acc: 0.6960
    Epoch 2/10
     - 12s - loss: 0.3485 - acc: 0.8506 - val_loss: 0.4994 - val_acc: 0.7766
    Epoch 3/10
     - 12s - loss: 0.3109 - acc: 0.8710 - val_loss: 0.5842 - val_acc: 0.7598
    Epoch 4/10
     - 13s - loss: 0.2874 - acc: 0.8833 - val_loss: 0.4420 - val_acc: 0.8136
    Epoch 5/10
     - 12s - loss: 0.2649 - acc: 0.8929 - val_loss: 0.6818 - val_acc: 0.7270
    Epoch 6/10
     - 14s - loss: 0.2402 - acc: 0.9035 - val_loss: 0.5634 - val_acc: 0.7984
    Epoch 7/10
     - 16s - loss: 0.2084 - acc: 0.9190 - val_loss: 0.6392 - val_acc: 0.7694
    Epoch 8/10
     - 16s - loss: 0.1855 - acc: 0.9289 - val_loss: 0.6388 - val_acc: 0.7650
    Epoch 9/10
     - 14s - loss: 0.1641 - acc: 0.9367 - val_loss: 0.8356 - val_acc: 0.7592
    Epoch 10/10
     - 19s - loss: 0.1430 - acc: 0.9451 - val_loss: 0.7365 - val_acc: 0.7766

    scores = model_rnn.evaluate(_test, x_test)

    25000/25000 [==============================] - 14s 567us/step

    scores[1]  # about one percentage point higher than the multilayer perceptron (0.80972)

    0.82084

IMDb sentiment analysis with an LSTM model

    from keras.models import Sequential
    from keras.layers.core import Dense, Dropout, Activation
    from keras.layers.embeddings import Embedding
    from keras.layers.recurrent import LSTM

    model_lstm = Sequential()

    model_lstm.add(Embedding(output_dim=32, input_dim=2000, input_length=100))
    model_lstm.add(Dropout(0.25))

    model_lstm.add(LSTM(32))

    model_lstm.add(Dense(units=256, activation='relu'))
    model_lstm.add(Dropout(0.25))
    model_lstm.add(Dense(units=1, activation='sigmoid'))

    model_lstm.summary()

    _________________________________________________________________
    Layer (type)                 Output Shape              Param #
    =================================================================
    embedding_2 (Embedding)      (None, 100, 32)           64000
    _________________________________________________________________
    dropout_3 (Dropout)          (None, 100, 32)           0
    _________________________________________________________________
    lstm_1 (LSTM)                (None, 32)                8320
    _________________________________________________________________
    dense_3 (Dense)              (None, 256)               8448
    _________________________________________________________________
    dropout_4 (Dropout)          (None, 256)               0
    _________________________________________________________________
    dense_4 (Dense)              (None, 1)                 257
    =================================================================
    Total params: 81,025
    Trainable params: 81,025
    Non-trainable params: 0
    _________________________________________________________________

    scores = model_rnn.evaluate(_test, x_test)

    25000/25000 [==============================] - 13s 522us/step

    scores[1]

    0.82084

As written, this last call evaluates model_rnn, and model_lstm is never compiled or trained, so the 0.82084 shown here is the RNN score again rather than an LSTM result. A real comparison needs compile and fit on model_lstm followed by model_lstm.evaluate(_test, x_test). The original reading of this output was that the LSTM does about as well as the RNN, possibly because the dependencies in these reviews span only short stretches of text, so the LSTM's advantage on long sequences does not show; the output above does not by itself confirm that.
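The parameter counts in the two recurrent summaries (784 for `SimpleRNN(units=16)` and 8320 for `LSTM(32)`) follow from the layer definitions. A simple recurrent layer has one set of input weights, recurrent weights and biases, while an LSTM has four such sets (three gates plus the candidate cell state). Here is a small plain-Python check, assuming the 32-dimensional embedding output as the layer input; the function names are illustrative only and are not part of Keras.

```python
def simple_rnn_params(input_dim, units):
    # Input weights + recurrent weights + one bias per unit.
    return units * (input_dim + units + 1)


def lstm_params(input_dim, units):
    # Four weight sets: input, forget and output gates plus the cell candidate.
    return 4 * simple_rnn_params(input_dim, units)


rnn_count = simple_rnn_params(input_dim=32, units=16)
lstm_count = lstm_params(input_dim=32, units=32)
print(rnn_count, lstm_count)
```

The LSTM layer here has roughly ten times as many parameters as the SimpleRNN layer, partly because it also uses 32 units instead of 16, so the two models differ in capacity as well as in cell type.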
