I was discussing content moderation with a senior manager at a social media company. We talked about the recurring pattern in which even two well-meaning people can escalate to name-calling and flaming, and whether there is scalable technology to handle de-escalation by circuit-breaking hotly debated threads. I decided to use Classify and sentiment analysis on a selected conversation to see what it predicts. If any linguistics major can weigh in on this, it would be very helpful. So far I have found that neutrality at the paragraph level does not imply neutrality at the sentence level. I have cropped the code here for privacy reasons.
ln[n] are the sentences, parag is the whole paragraph, and sentlnn are the sentiments for each sentence.
I am not a linguistics major, but I would use sentence sentiment rather than paragraph. In your example at least they appear to be consistent. The average of the sentence sentiments is the same as the paragraph sentiment. You could also try a moving average of the sentence sentiment to detect positive / negative trends. For an example, see this Wolfram blog post.
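To make the averaging and moving-average idea concrete, here is a minimal sketch in Python rather than Wolfram Language. Everything in it is an assumption for illustration: the numeric encoding of the labels (Positive = 1, Neutral = 0, Negative = -1), the made-up list of per-sentence labels, the window size of 3, and the helper name `moving_average`. It is not the code from the original post, which was cropped.

```python
from statistics import mean

# Assumed numeric encoding of sentiment labels (not taken from the original post).
SCORE = {"Positive": 1.0, "Neutral": 0.0, "Negative": -1.0}


def moving_average(values, window):
    """Return the mean of each full consecutive window of `values`."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        mean(values[i:i + window])
        for i in range(len(values) - window + 1)
    ]


# Hypothetical per-sentence sentiment labels, invented for this example.
labels = ["Positive", "Neutral", "Neutral", "Negative", "Negative"]
scores = [SCORE[label] for label in labels]

# A single number summarizing the whole paragraph.
overall = mean(scores)

# A windowed view that shows whether the tone is drifting over time.
trend = moving_average(scores, window=3)

print("overall:", overall)
print("trend:", trend)
```

In this made-up example the overall average is only mildly negative, while the moving average falls steadily from one window to the next. That kind of downward drift is what you would want a circuit breaker to react to, and a single paragraph-level score would hide it.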
My first guess is that the divergence might occur because people who are deeply invested stakeholders examine sentiment line by line, while less invested stakeholders judge the paragraph as a whole.