00:01 One of the most important things you can do for performance
00:04 in your database, especially in these document databases,
00:06 is to think about your document design:
00:08 should you embed stuff, should you not, what embeds where,
00:11 do you embed just ids, do you embed the whole thing;
00:14 all of these are really important questions
00:16 and it takes a little bit of experience to know what the right thing to do is.
00:20 It also really depends on your application's use case,
00:24 so one obvious thing we should consider
00:28 is the service history, which adds the most weight to these car objects.
00:34 We've got this embedded document list field,
00:38 so how often do we need these histories?
00:44 How many histories might a car have?
00:46 Should those maybe be in a separate collection
00:49 that has everything the service record class has,
00:52 plus a car id, or something to that effect?
00:56 So this is a really important question,
00:59 and it really depends on how we're using this car object, this car document:
01:05 if almost all the time we want to work with the service history,
01:07 it's probably good to embed it here,
01:10 unless the histories can get really large or something to that effect,
01:13 but if you don't need them often, consider putting them in their own collection.
01:16 There's a tension here: separation adds complexity
01:20 and gives up some of the safety of a single document, but it buys speed,
01:24 because you don't pull the histories back all the time;
01:26 you can also consider using the only() or exclude() methods on a MongoEngine query
01:30 to say, if I don't need it, leave out the service history.
01:34 It adds a little bit of complexity because you often need to know,
01:38 hey, is this a car that came with its service history
01:40 or is it a car where that was excluded, things like that,
01:42 but you could use performance profiling and tuning
01:45 to figure out where you might use only().
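To make the trade-off concrete, here is a minimal sketch of the two layouts using plain Python dictionaries in place of real MongoDB documents. The field names (model, make, description, price) and the helper functions are made up for illustration; they are not the course's actual classes, and without_fields only imitates what a projection such as only() or exclude() would do on the server.

```python
# Layout A: the service history is embedded in the car document.
car_embedded = {
    "_id": 1,
    "model": "F40",          # hypothetical field
    "make": "Ferrari",       # hypothetical field
    "service_history": [
        {"description": "Oil change", "price": 120.0},
        {"description": "New tires", "price": 900.0},
    ],
}

# Layout B: service records live in their own collection and carry a car_id.
car_separate = {"_id": 1, "model": "F40", "make": "Ferrari"}
service_records = [
    {"car_id": 1, "description": "Oil change", "price": 120.0},
    {"car_id": 1, "description": "New tires", "price": 900.0},
]


def history_for(car_id, records):
    """The extra lookup layout B needs whenever the history is wanted."""
    return [r for r in records if r["car_id"] == car_id]


def without_fields(doc, *excluded):
    """Imitate a projection: a copy of the document minus the given fields."""
    return {k: v for k, v in doc.items() if k not in excluded}


# A "partial" car: the caller now has to remember the history was left out.
partial_car = without_fields(car_embedded, "service_history")
print("history loaded:", "service_history" in partial_car)
print("records for car 1:", len(history_for(1, service_records)))
```

Layout A answers "give me the car and its history" in one read, while layout B keeps the car small at the cost of a second lookup. The partial car shows the bookkeeping that excluding a field introduces.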
01:48 Let's look at one more thing around document design.
01:50 You want to consider the size of the document,
01:52 remember MongoDB has a limit on how large these documents can be,
01:56 that's 16 MB per document, but that doesn't mean you should think,
02:01 oh, it's only 10 MB so everything is fine for my document design;
02:05 that might be terrible. This is a hard upper bound:
02:07 MongoDB will refuse to store a document larger than 16 MB,
02:11 so you really want to think about what is the right size,
02:14 so let's look at a couple examples:
02:16 we can go to any collection and call .stats()
02:18 and it will tell us about the size of the documents and things like that,
02:21 so here we ran db.cars.stats() in the MongoDB shell,
02:25 and we see that the average object size is about 700 bytes,
02:29 there is information about how many there are, and all that kind of stuff,
02:33 but really the most interesting thing for this discussion is
02:35 what is the average object size: 700 bytes.
02:38 That seems like a pretty good size to me, it's not huge by any means,
02:42 and these are the cars that contain those service histories,
02:45 so this is probably fine for what we're doing.
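As a rough illustration of how to read those numbers, the sketch below works from a dictionary shaped like a small part of the stats output (the size and count fields) and compares the average against the 16 MB cap. The count and size values are invented for the example, and the helper names are hypothetical; in practice the figures would come from the real stats call.

```python
# MongoDB's hard cap on a single document: 16 MB (16 * 1024 * 1024 bytes).
MAX_BSON_DOCUMENT_BYTES = 16 * 1024 * 1024


def average_object_size(stats):
    """Average document size in bytes: total data size over document count."""
    if stats["count"] == 0:
        return 0.0
    return stats["size"] / stats["count"]


def share_of_limit(avg_bytes):
    """Fraction of the 16 MB cap that an average document takes up."""
    return avg_bytes / MAX_BSON_DOCUMENT_BYTES


# Made-up numbers chosen so the average matches the 700 bytes in the talk.
cars_stats = {"count": 1000, "size": 700_000}

avg = average_object_size(cars_stats)
print(f"average object size: {avg:.0f} bytes")
print(f"share of the 16 MB cap: {share_of_limit(avg):.4%}")
```

Being far below the cap does not by itself mean the design is good; it only shows that the hard limit is not the constraint that matters here.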
02:48 Let me give you a more realistic example.
02:50 Let's think about the Talk Python Training website,
02:52 and the courses and chapters, we talked about them before,
02:56 so here if we run that same thing, db.courses.stats(),
03:02 you can see that the average object size is 900 bytes for a course,
03:07 and remember the course has the description that shows on the page
03:10 and that's probably most of the size; it has a few other things as well,
03:13 like student testimonials and whatnot,
03:16 but basically it's the description and a few hyperlinks.
03:19 So I think this is again a totally good average object size.
03:23 Now one of the considerations was I could have taken the chapters
03:27 which themselves contain all the lectures,
03:29 and embedded those within the course,
03:32 would that have been a good idea?
03:34 I think I might even have created it that way
03:36 in the very beginning, and it was a lot slower than I was hoping for,
03:38 so I redesigned the documents.
03:40 If we run this on the chapters collection, you can see
03:43 that the average object size is 2.3 KB,
03:46 this is starting to get a little bit big, on its own it's fine,
03:50 but think about the fact that a course on average has like 10 to 20 chapters,
03:55 so if I embedded the chapters in the course
03:58 instead of putting them in separate documents like I do
04:02 (that is how it actually runs at the time of the recording),
04:04 then each of these courses would be something like
04:07 24 up to maybe 50 KB of data per entry.
04:12 Think about that: you go to the courses page
04:15 and it shows you a big list of all the courses
04:17 and there might be 10 or later 20 courses,
04:20 we would be pulling back and deserializing on the order of a megabyte of data
04:24 to render a really, really common page, that is probably not ok,
04:28 so this is why I did not embed the chapters and lectures inside the course,
04:34 I just said okay, this is the breaking point.
04:37 I looked at the object sizes, I looked at where the performance was,
04:41 and I said, you know what, really it's not that common
04:44 that we actually want more than one chapter at a time,
04:46 but it is common that we want a chapter's lectures, so this is probably the right partitioning,
04:51 but if you build it one way, try it, and it doesn't work,
04:53 you just redesign your class structure, recreate the database and try it again.
04:57 But you do want to think about the average object size,
05:00 and you can check it super easily with db.<collection name>.stats().
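The back-of-the-envelope estimate for the courses page can be written out as a short calculation. This is only a sketch under simplifying assumptions: it uses the averages quoted in the talk (900 bytes per course, 2.3 KB per chapter, rounded here to 2,300 bytes), ignores any per-document overhead, and the function names are made up for the example.

```python
COURSE_BYTES = 900       # average course document, as quoted in the talk
CHAPTER_BYTES = 2_300    # average chapter document (2.3 KB, rounded to bytes)


def embedded_course_bytes(chapters_per_course):
    """Estimated size of one course if its chapters were embedded in it."""
    return COURSE_BYTES + chapters_per_course * CHAPTER_BYTES


def listing_page_bytes(course_count, chapters_per_course, embed_chapters):
    """Estimated data pulled back to render a page listing every course."""
    if embed_chapters:
        per_course = embedded_course_bytes(chapters_per_course)
    else:
        per_course = COURSE_BYTES
    return course_count * per_course


for chapters in (10, 20):
    print(chapters, "chapters:", embedded_course_bytes(chapters), "bytes per course")

print("20 courses, chapters embedded:", listing_page_bytes(20, 20, True), "bytes")
print("20 courses, chapters separate:", listing_page_bytes(20, 20, False), "bytes")
```

Under these assumptions an embedded course lands at roughly 24 to 47 KB, and a 20-course listing pulls back a bit under a megabyte, compared with about 18 KB when chapters stay in their own collection.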